Q L   H A C K E R ' S   J O U R N A L
===========================================
      Supporting All QL Programmers
===========================================
#29                            October 1998

The QL Hacker's Journal (QHJ) is published by Tim Swenson as a
service to the QL Community. The QHJ is freely distributable. Past
issues are available on disk, via e-mail, or via the Anon-FTP server,
garbo.uwasa.fi. The QHJ is always on the lookout for article
submissions.

        QL Hacker's Journal
        c/o Tim Swenson
        2455 Medallion Dr.
        Union City, CA 94587
        swensont@geocities.com
        swensont@jack.sns.com
        http://www.geocities.com/SiliconValley/Pines/5865/index.html


EDITOR'S FORUM

I'd like to thank Per Witte for providing pretty much the core part
of this issue. I saw the Filename Parser program posted to the
ql-users mailing list and thought it would be good material for an
article. I contacted Per and he was willing to write something up.
The length of the article was more than I expected, but I'm not
complaining.

I'm still pressing forth with the Qlib Source Book. I've been delayed
by a broken hand (out of cast and getting better). I'm pushing myself
to get something out soon, even if only about 10% of what I would
like to see in the Book. I'm looking for information about the
history of Qliberator (the various releases) and any bugs in the
current release (3.36). If you have some info, please send it to me.

My other project is the QL PD Documentation Project (I'm using PD
loosely to mean freely available). I'm trying to collect (in one
location) a variety of released information about the QL and QDOS.
Currently I have a number of documents, including parts of the QL
User Guide, a couple of Tutorials, and a few articles that I've
scanned in from IQLR. Please take a look at my web site to see what
I've gathered. It is:

        www.geocities.com/SiliconValley/Pines/5865/


SHELLING OUT TO SUPERBASIC

The QL is unique in how QDOS and SuperBasic are sort of rolled into one.
Just as we can only have one copy of QDOS running, we can only have
one copy of SuperBasic running (OK, I know this is not true for
Minerva).

In a traditional operating system model, there is the OS and the
Shell. The Shell is a user interface to the operating system. In UNIX
there are many different shells: Bourne (sh), C (csh), Korn (ksh),
etc. In MS-DOS the shell was in COMMAND.COM. Even with Win95, there
is an MS-DOS Shell still available.

In most OSes, there can usually be more than one shell running at one
time. Application programs can fire off another Shell and run one or
more commands. The new shell becomes a child process of the calling
application. All of this comes about because the shell is nothing
more than another user application. On the QL we are limited in that
only one copy of SuperBasic can be run at one time and it must be
Job 0.

In QDOS, user applications can call other programs. An editor could
call C68 to compile a program that was just edited and saved. But
when the command that needs to be run is a built-in SuperBasic
command (which includes loadable extensions), there is no way to
shell out to SuperBasic to run the command.

I ran into this problem when I wanted to run Qliberator from within
MicroEmacs. MicroEmacs allows for executing programs from MicroEmacs,
but I could not run all the commands for QLiberator that I needed. To
explain, here are the steps to run QLiberator:

   1. Load the program into SuperBasic
   2. Call "Liberate" to create a working file.
   3. Execute Qlib

Here are the steps in SuperBasic terms:

   LOAD FLP1_TEST_BAS
   LIBERATE FLP1_TEST_BAS
   EXEC QLIB

Since I can't call LOAD or LIBERATE from within MicroEmacs, I thought
I could write a SuperBasic program and then compile it. This may work
with LIBERATE, but LOAD is not a command that QLib will compile.

I thought I was stuck until I started reading the HotKey System II
manual. It was there that I ran across the two commands, HOT_CMD and
HOT_DO.
HOT_CMD assigns to an ALT key a command that will be entered in the
SuperBasic window (#0). It does not have to be an executable program,
but can be a SuperBasic statement or a resident command (like TKII or
DIYtoolkit). HOT_CMD "picks" the SuperBasic interpreter to the top
and the command is sent to SuperBasic to be run.

The HotKey System II was designed to be driven by the user. It is the
user typing in a HotKey sequence that puts the System into motion.
There is a HotKey command that allows automation of the HotKey
System: HOT_DO. HOT_DO tells the System to implement a HotKey.
HOT_DO('a') is the same as the user hitting the ALT-a keys.

Using HOT_CMD and HOT_DO in conjunction, the programmer can perform
actions just as if they were "shelling" out to SuperBasic. In the
case stated above, here is how I would use the two commands to
automate using Qliberator:

   10 file$ = "flp1_test_bas"
   20 ERT HOT_CMD('a','LOAD '&file$)
   30 HOT_DO('a')
   40 ERT HOT_CMD('b','LIBERATE '&file$)
   50 HOT_DO('b')
   60 EXEC QLIB_OBJ : REMark Just EXEC it.

I can get away with having LIBERATE in a compiled QLIB program, but
not LOAD. The approach above is a way to get around this limitation.


PARAMETER PASSING TECHNIQUES IN S*BASIC
Per Witte

In QHJ#24 and #25 there were articles on parameter passing techniques
(by Tim Swenson and by Peter Tillier, respectively). I won't (too
much) re-hash what's already been said (this article was prepared
before I was aware of the other two), but look at the matter in a
practical way that may suit some readers.

I'm using SBASIC for this article - the enhanced SuperBASIC
interpreter that comes with SMSQ and SMSQ/E. SBASIC behaves somewhat
differently to SuperBASIC with respect to variable handling, and has
some desirable features not available in standard SuperBASIC. Where
this affects the subject at hand I shall try to point out the
differences. However, I am presently not able to test my examples in
SuperBASIC, so some incompatibilities (ie, bugs) may be found.
Please always ensure that the techniques described work with your
version of S*BASIC before relying on them in any way.

<<< Value or Reference ? >>>

There are two ways of passing parameters to functions and procedures
in S*BASIC: by value, which is perhaps the "intuitive" method; and by
reference, which will be the main focus of this article.

Passing parameters by value is what we may THINK we normally do.
RUNning the program fragment below

   10 test1 1,2,3
   99 :
   100 defproc test1(a,b,c)
   110 print a,b,c
   120 enddef
   130 :

would print "1 2 3" on your screen (with any luck!). And of course:

   10 x=1:y=2:z=3: rem Assign values to some variables
   20 test1 x,y,z: rem and use these instead.

does the same. But a small modification of test1, test2, shows what's
really going on:

   99 :
   100 defproc test2(a,b,c)
   110 print a,b,c
   120 a=a+a:b=b+b:c=c+c:rem Double all parameter variables!
   130 enddef
   140 :

The new harness:

   10 x=1:y=2:z=3
   20 test2 x,y,z
   30 print x,y,z

   1 2 3 <- prints out x, y, & z, as expected
   2 4 6 <- but what's going on here? We set x,y,z to be 1,2,3!

By altering the values of the parameter variables a, b, & c, we cause
a change to the calling variables x, y, & z too. This is a call by
reference; we don't pass to the procedure merely the values the
variables contain, instead we refer to the original variables - a, b
& c ARE x, y, z, only by a different name.

As you will appreciate, passing parameters by reference is not always
desirable. In fact, unless you specifically want to do so, it could
be a real pain: you can see how a procedure might easily (and
unintended by you) alter its parameters, and thereby variables
external to itself. To avoid this you can apply the rule never to
alter a procedure's formal parameters within the procedure; or you
must pass your variables by value only. But how to do that?

If you typed: test2 1,2,3, what do you think happens to a, b, c?
Well, their values are simply thrown away when the routine returns.
By extension, the same holds good for: test2 p+1,q+1,r+1, having
previously set p, q, r to some value. Anything other than a variable
is considered an expression in this context, and can therefore not
receive a return value. Thus:

   10 x=1:y=2:z=3
   20 test2 (x),y+0,z^1
   30 print x,y,z

   1 2 3 <- prints out x, y, & z, as expected
   1 2 3 <- prints out x, y, & z, as expected(?)

Good programming practice would avoid altering the parameter
variables - copy their values into LOCal variables instead!

In my opinion, test2 (x),(y),(z) gives the clearest indication of
intent, besides being (marginally) faster than, say, x+0,y+0,z+0, and
so is a good convention to adopt for call by value.

<<< Coercion >>>

There are other "oddities" about the way parameter passing works. For
example:

   10 x$='a':y=3:z%=3
   20 test3 x$,y,z%
   30 print x$,y,z%
   99 :
   100 defproc test3(a,b,c)
   110 print a,b,c
   120 a = a & a : b = b / 2 : c = c / 2
   130 enddef
   140 :

RUNning this program produces:

   a   3    3   <- x$ y z%
   aa  1.5  2

Not what you'd think, looking at the formal parameters a, b, c!
However, this can be very useful, as will be shown later.

The thing to watch out for, though, is this: you may assume that the
formal parameter decides the type, when it actually is the calling
parameter that does so! An example might be:

   10 x=1:y=4
   20 fast_test x,y,10
   99 :
   100 defproc fast_test(a%,b%,c%)
   110 rep loop
   120 a%=a%+1:b%=b%+b%:c%=c% div 3
   130 if c%=0:exit loop
   140 endrep loop
   150 enddef
   160 :

You're expending all this effort optimising fast_test; swapping
floating point variables for integers, and the like. You need not
have bothered! This is what it's actually doing:

   120 a=a+1:b=b+b:c=c div 3

In fact everything runs in (relatively) slow floating point! The
correct moves are:

   10 x%=1:y%=4:z%=10
   20 fast_test x%,y%,z%

which will pass integers to fast_test, and/or

   100 defproc fast_test(a%,b%,c%)
   110 loc r%,s%,t%
   120 r%=a%:s%=b%:t%=c%
   etc..
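
Written out in full, the second variant might look like the sketch
below. It only reuses the body of fast_test shown earlier, with the
work moved onto the LOCal integers r%, s%, t%. It has not been tested
on a real interpreter, and unlike the original it no longer hands any
results back to the caller.

   100 defproc fast_test(a%,b%,c%)
   110 loc r%,s%,t%
   120 r%=a%:s%=b%:t%=c%: rem copy in; this is where coercion happens
   130 rep loop
   140 r%=r%+1:s%=s%+s%:t%=t% div 3: rem work on the local copies only
   150 if t%=0:exit loop
   160 endrep loop
   170 enddef
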
What you then use in the formal parameter list is irrelevant (except
as a reminder as to what the correct type should be!). Also, copying
the parameters into LOCal variables will coerce the parameters back
to the desired type.

In tk2 there are commands to test the parameters that are passed to a
procedure: PARTYP tells you the actual parameter type (nul (never nul
in SBASIC), string, float, or integer) and PARUSE whether the
parameter is an array or not.

<<< Returning Values through the Parameter List >>>

A "by-product" of the ability to pass parameters by reference is that
we can actually return more than one value to the calling program.
Both functions and procedures can be used for this. I find the

   function_error = Function(update-able parameter list)

construct particularly useful, as I hope to show. Below follows a
commented listing of a filename parsing utility for S*BASIC that
hopefully illustrates the technique:

   1 PRINT,'(Simplified) Filename Parser'
   2 REMark ©PWitte 1998
   3 PRINT,!!!!!'PD - No Warranties'!!!!!
   4 :
   5 dfnm$='win1_bas_util_fnm_ParseFnm_bas'
   6 er=ParseFnm(dfnm$,ddev$,ddir$,dnm$,dext$)
   7 PRINT\\'Fnm:'!dfnm$\\'Dev:'!ddev$\'Dir:'!ddir$\
   8 PRINT 'Nme:'!dnm$\'Ext:'!dext$\'Err:'!er
   9 STOP
   10 :

Above. The first part of the program gives some information, and
shows an example of usage.

   32724 DEFine FuNction ParseFnm(f$,v$,d$,n$,e$)

This part of the program is the object of the enterprise: the file
name parser itself. Due to the nature of the QL's file system (FS),
it is impossible to determine how much of the latter part of the name
is filename and how much is directory name merely by inspecting the
filespec. You have to actually open the file (or its directory) to
find out. The function does this, and then breaks up the filespec
according to a mixture of known facts and assumptions (ie, it's not
foolproof!). It puts the different sections into the supplied
variables, and returns ok.
The function is defined as a floating point function, even though its
main task is to manipulate and, you might say, return text. In this
simplified version any values pre-supplied in v$..e$ are overwritten.
The only parameter you should supply is f$ (for Full Filespec). This
is (more or less) expected to be in the form of:

   key: <> = name; | = or; [] = optional (0..1); {} = repeated (0..)
   = directory separator, '_'
   = extension separator, '_ | .' (SMSQ/E)
   = {} [[[]]]

   32725 LOCal c,t,p%,i%
   32726 REMark Split filename into components
   32727 c=FOP_DIR(f$):IF c<0:RETurn c

FOP_DIR is a function (introduced in Toolkit I/II (tk2), by Tony
Tebby, and included with many disk interfaces, and in SMSQ*). It
tries to make the best of the information supplied, and will open the
first directory that matches the first part of the filespec. So if
you have a directory called 'win1_asm_' (but none called
'win1_asm_prg_...') and you did a

   ERT FOP_DIR(win1_asm_prg_temp)

the function would open directory 'win1_asm_', taking the rest of the
filespec to be a filename!

   32728 d$=FNAME$(#c):CLOSE#c

FNAME$ (also a function from tk2) returns the name of any file,
including directory files. So, continuing our specific example above,
d$ (for Directory) would now contain 'asm'. Note that the device name
is not returned.

   32729 IF LEN(d$) THEN

FNAME$ did return (at least the first) part of the directory name, eg
'asm'.

   32730 p%=d$ INSTR f$:IF p%=0:RETurn -7

If the filename returned by FNAME$ is, after all, not in the
filespec, return the error Not Found. (This would be the case if you
tried to:

   DATA_USE 'win1_asm'
   ERT FOP_DIR(#3;'abc_test')

FNAME$(#3) would then return 'asm'.)

   32731 d$=d$&'_'

If d$ _is_ a substring of filespec, append the filename separator (as
the last one is not stored in the directory file).

   32732 ELSE

At this point d$ is ''. This could mean that had been specified; that
no matching directory was found (eg had we specified
'win1_prog_temp_..' and there was no 'win1_prog_..'); or that
something was wrong.

   32733 p%=('_' INSTR f$)+1

Do a primitive test on the filespec to see if it contains a
devicename, eg 'win1_..'.

   32734 END IF
   32735 v$=REMV$(p%,LEN(f$),f$)

v$ stands for deVicename. v$ gets set to the first part of filespec,
up to the first underscore.

   32736 IF LEN(v$)<3:RETurn -12

Better would have been:

   32735 IF p%<4 OR p%>6:RETurn -12

This version of the filename parser doesn't support networked drives,
so:

   32735 IF p%<>6:RETurn -12

would be correct here. Then:

   32736 v$=REMV$(p%,LEN(f$),f$)

Tests whether the first part of the filespec is a possible
devicename. (Devicenames can only legally be 3, 4, or 5 characters
long, as in: 'S7_', 'n63_', 'ram2_'. Anything else is an error.)
Further tests should be done here to determine whether v$ is a
'legal' device name, but there is no easy way of knowing for sure.
(Try: OPEN_NEW#3; 'flp7_test':PRINT FNAME$(#3) and see what you get,
presuming you don't have an flp7_ ;)

   32737 IF p%+LEN(d$)=LEN(f$) THEN
   32738 n$='':e$=''

We allow filespec to be incomplete from devicename down. In the case
above filespec == device name & directory name, ie there is no
filename and no extension.

   32739 ELSE
   32740 n$=REMV$(1,p%+LEN(d$)-1,f$)

There is a name (and possibly an extension). Let n$ (for fileName)
hold it for now.

   32741 p%=0
   32742 FOR i%=LEN(n$) TO 1 STEP -1
   32743 IF n$(i%) INSTR '_.':p%=i%:EXIT i%
   32744 END FOR i%

Here we just scan the filename from the end of the string for the
first '.' or '_' encountered. Everything from there on, the function
decides, will be the extension.

   32745 IF p%=0 THEN
   32746 e$=''

No extension found.

   32747 ELSE
   32748 e$=REMV$(0,p%-1,n$)
   32749 n$=REMV$(p%,99,n$)

Slice filename into name part and extension part.

   32750 END IF
   32751 END IF
   32752 RETurn 0

Return OK.

   32753 END DEFine
   32754 :

The final part of the program is a help-function REMOVE$ (shortened
to REMV$ in its S*BASIC incarnation). All it does is simplify string
slicing by encapsulating all the error trapping. It won't be looked
at here.
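
Although the listing is not discussed further, three direct commands
give a feel for what REMV$ does, assuming it behaves as listed: it
returns str$ with the characters from position fr% to position to%
removed. The results in the REMarks were worked out by hand from the
listing, not taken from a run.

   PRINT REMV$(1,5,'win1_test_bas'): REMark test_bas
   PRINT REMV$(6,99,'win1_test_bas'): REMark win1_
   PRINT REMV$(5,5,'win1_test_bas'): REMark win1test_bas
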
   32755 DEFine FuNction REMV$(fr%,to%,str$)
   32756 IF fr% < 2 THEN
   32757 IF to% >= LEN(str$):RETurn ''
   32758 RETurn str$(to% + 1 TO LEN(str$))
   32759 END IF
   32760 IF to% >= LEN(str$) THEN
   32761 RETurn str$(1 TO fr% - 1)
   32762 ELSE
   32763 RETurn str$(1 TO fr% - 1) & str$(to% + 1 TO LEN(str$))
   32764 END IF
   32765 END DEFine
   32766 :

The weird numbering scheme is to enable the function to be easily
MERGEd into a larger program that needs it; line numbers <100 can be
removed after testing.

<<< Arrays as Parameters >>>

Arrays are also passed by reference; when you supply an array
parameter you are allowing the procedure to access your actual array.
The same rules described above regarding type coercion also apply to
arrays.

Unfortunately, S*BASIC provides only limited "mass" operations (for
lack of a better term) on arrays, though you can pretty much slice
them up any which way you choose. This comes in handy if you want to
write your own mass-ops in S*BASIC or machine code. You can't do
a = b with arrays in S*BASIC, but you can write your own EQU a TO b,
which does exactly the same (see commented listing of EQU below).

   1 DIM a$(2,2,2,2,6),b$(2,2,2,2,8)
   2 DIM a(2,2,2,2),b(2,2,2,2)
   3 FOR i%=0 TO 2:FOR j%=0 TO 2:FOR k%=0 TO 2:FOR l%=0 TO 2:a$(i%,j%,k%,l%)='L'&i%&j%&k%&l%
   4 count%=0
   5 FOR i%=0 TO 2:FOR j%=0 TO 2:FOR k%=0 TO 2:FOR l%=0 TO 2:a(i%,j%,k%,l%)=count%:count%=count%+1
   6 :

Above. Initialise a few test arrays (use plenty of dimensions :)
Note that you'll have to modify all integer FOR-loops to make this
program run under plain QL SuperBASIC!

   10 CLS:PRINT a$,\
   12 er=EQU(b$,a$)
   14 CLS#0:CLS#2:PRINT#2;b$,\:PRINT#0;er
   16 BEEP 2000,20:PAUSE
   18 CLS:PRINT !a!\
   20 er=EQU(b TO a)
   22 CLS#2:PRINT#2!b!\:PRINT#0;er\
   24 BEEP 2000,20
   26 :

Test harness. Edit to taste.
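
Before turning to EQU itself, the claim that arrays are passed by
reference can be checked with a much smaller sketch. The procedure
name double_all and the array v are made up for the illustration, and
a floating point loop variable is used so that it should also run
under plain SuperBASIC. It has not been tested here.

   10 DIM v(3)
   20 FOR i=0 TO 3:v(i)=i+1
   30 PRINT v: REMark the original contents
   40 double_all v
   50 PRINT v: REMark the caller's array, after the call
   60 STOP
   99 :
   100 DEFine PROCedure double_all(arr)
   110 LOCal k
   120 FOR k=0 TO DIMN(arr):arr(k)=arr(k)*2
   130 END DEFine
   140 :

If the array really is passed by reference, the second PRINT should
show 2, 4, 6, 8 rather than the original 1, 2, 3, 4.
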
   100 REMark EQU SBASIC function to
   101 REMark EQUate two arrays of the
   102 REMark same dimensions and type
   103 REMark Requires tk2 or equivalent
   104 :
   105 REMark PWitte, August 1998
   106 REMark For "educational" purposes only
   107 REMark Use at own risk. No warranties!
   108 :

Can't say you haven't been warned!

   1000 DEFine FuNction EQU(a,b)

The idea is to equate array a with array b in a reasonably rational
manner, while at the same time demonstrating some of the niceties of
parameter passing techniques using arrays. The first thing to note is
that EQU will handle any type of array, ie integer, string, & float -
although the parameter list only shows float! Also, any number of
dimensions is handled. The only provision, in this implementation, is
that they are of the same type and have the same number and size of
dimensions (except string, in the last dimension).

   1010 LOCal er

This nice little feature of "inheritance" is not documented anywhere,
as far as I know: a LOCal variable defined in one procedure will
remain local in any procedure called by that procedure, unless the
variable has been "re-defined" by a subsequent use of LOCal. Here the
local variable, er (error flag), is set in EQU, the calling function,
and modified in EQN/EQS. Yet a variable er, defined in the initial
code, outside any procedure body, would retain its original value.
The only danger is that if the same sub-routines were to be reused by
another procedure, you may forget to declare it as LOCal in the
calling procedure and end up mysteriously modifying a GLOBal variable
instead! It certainly saves the repeated overhead of stacking a local
variable for each recursive call to EQN/EQS here, as would be the
case if we defined er in EQN/EQS.

   1020 IF PARTYP(a)<>PARTYP(b):RETurn -15
   1030 IF PARUSE(a)<>PARUSE(b):RETurn -17

Checks whether parameters are arrays, and of the same type. The error
checking here is not foolproof.
   1040 er=0
   1050 IF PARTYP(a)=1 THEN
   1060 RETurn EQS(a,b)
   1070 ELSE
   1080 RETurn EQN(a,b)
   1090 END IF

String arrays must be handled slightly differently to numeric ones,
in that the last dimension is the string itself. It might be possible
to find a universal algorithm to handle numbers and strings, but it
makes sense to use the built-in mass assignment features and copy
whole strings at once, rather than byte by byte. So the string and
numeric sides have been implemented as separate functions. Another
advantage is that this offers the opportunity to optimise them
(however slightly) for their different uses.

   1100 END DEFine
   1110 :
   2000 DEFine FuNction EQN(a,b)
   2010 LOCal i%

Another reason for separating out these sub-routines as functions is
that we can take advantage of the interpreter's excellent array
slicing abilities, as in line 2070.

   2020 IF DIMN(a)<>DIMN(b):RETurn -4

Every dimension has to match in size. This test will be performed
before any processing takes place. The alternative would be to use a
special sub-routine.

   2030 IF DIMN(a(0))=0 THEN

Look-ahead: if the next dimension is past the last, then this is the
dimension we can work with:

   2040 FOR i%=0 TO DIMN(a):a(i%)=b(i%)

Copy this dimension from b to a, element by element. This also
terminates recursion at this level.

   2050 ELSE
   2060 FOR i%=0 TO DIMN(a)
   2070 er=EQN(a(i%),b(i%)):IF er<0:EXIT i%
   2080 END FOR i%

More than anything, a function like EQU wants speed, so loops have
been specialised, reducing the overheads of test & branch. The use of
recursion is almost necessary, thanks to the parser's array-slicing
abilities!

   2090 END IF
   2100 RETurn er

The error-checking stuff is not strictly necessary in a programming
toolkit - error-checking is performed at the program level - but it's
easier to delete it than add it when needed (eg during program
development).

   2110 END DEFine
   2120 :
   3000 DEFine FuNction EQS(a,b)

Pretty much the same as for EQN above, but optimised for string
operations.
   3010 LOCal i%

Note that the same technique cannot be used for i% as for er. i%'s
value will be different at different levels of recursion, ie i% has
to be saved between levels.

   3020 IF DIMN(a)<>DIMN(b):RETurn -4

This arrangement was actually a bug, as no comparison is made on the
last dimension. However, as the interpreter doesn't complain and
simply ignores any supernumerary characters, I thought I'd leave it
there as a feature. Ie, DIM a$(5,10),b$(5,8): er=EQU(b$,a$) works,
though any strings longer than eight characters will be truncated.

   3030 IF DIMN(a(0,0))=0 THEN

Remember a() refers to an array of type string! Operations can be
performed at a higher structural level, so we terminate recursion one
level up.

   3040 FOR i%=0 TO DIMN(a):a(i%)=b(i%)

Copy one whole string at a time.

   3050 ELSE
   3060 FOR i%=0 TO DIMN(a)
   3070 er=EQS(a(i%),b(i%)):IF er<0:EXIT i%
   3080 END FOR i%
   3090 END IF
   3100 RETurn er
   3110 END DEFine
   3120 :

Genuine bug/incompatibility reports, suggestions and comments
welcome. Send to pjwitte@knoware.nl