Modularization of source code

Discuss text adventures here! The classics like those from Infocom, Magnetic Scrolls, Adventure International and Level 9 and the ones we're making today.

Moderators: Ice Cream Jonsey, AArdvark

Tdarcos
Posts: 6910
Joined: Fri May 16, 2008 9:25 am
Location: Arlington, Virginia
Contact:

Modularization of source code

Post by Tdarcos »

This article deals with the history of why older source code tends to be one large file ("monolithic") instead of broken into several smaller files ("modular"). If you're not interested in a history lesson, you can skip this article.

I want to talk about a feature in programming in general - and interactive fiction in particular - that has more-or-less become available as a result of better editing tools and the development of source code modularization through tools like "make."

How many of you have ever read any of the older text adventure source files, such as DUNGEON (the predecessor to Zork)? Or the "grandfather" of all interactive fiction, the one that started it all, Crowther and Woods' Colossal Cave Adventure? If you don't know Fortran, probably never. Or, if you were lucky, you had the opportunity to read it in one of the translations, such as C, or in an IF authoring tool such as Hugo, TADS, AGT or something else.

But to paraphrase Klingon chancellor Gorkon in Star Trek VI: The Undiscovered Country, "You have not experienced programming IF until you've read Colossal Cave in the original Fortran." As far as software resources are concerned, we live in "a post-scarcity economy." Computers are inexpensive and powerful; memory is plentiful, vast, and cheap; disk space is so inexpensive it's almost free; and displays are so sharp and clear for graphical images that it's almost heartbreaking.

But a look back at 1970s technology reminds us of where we came from and the resource limitations we had to live with. Colossal Cave ("CC") had to handle everything in upper case because mainframes hadn't shifted out of the Punched Card era (I explain that term later), when everything was in upper case to make sorting faster and save disk space. Colossal Cave compressed text input into 6 bits per character in order to fit 6 characters in a 36-bit word (PDP-10), or 5 characters in a 32-bit word (IBM 360/370, minicomputers). Memory was scarce and very expensive. Have you bought memory lately? A 4-GB memory rod might cost about $30, for which you are getting about 4 million K of RAM. Back when CC was written, memory cost about $1,000 a K. Literally a dollar a byte. This meant computers didn't have a lot of memory, and what they did have was precious.
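The 6-bit packing described above can be sketched in modern terms. This is a rough illustration, not CC's actual code: the character mapping below follows DEC's SIXBIT convention (printable ASCII 32-95, minus 32), and the actual tables Crowther and Woods used may have differed.

```python
# Sketch of SIXBIT-style packing: 6 characters of 6 bits each
# squeezed into one 36-bit PDP-10 word.

def pack_sixbit(text):
    """Pack up to 6 characters into one 36-bit integer."""
    word = 0
    for ch in text[:6].upper().ljust(6):      # pad short words with spaces
        code = ord(ch) - 32                   # ' '..'_'  ->  0..63
        if not 0 <= code < 64:
            raise ValueError(f"not representable in 6 bits: {ch!r}")
        word = (word << 6) | code             # shift in 6 bits per character
    return word

def unpack_sixbit(word):
    """Recover the 6 characters from a packed 36-bit word."""
    return "".join(chr(((word >> shift) & 0x3F) + 32)
                   for shift in range(30, -1, -6))

w = pack_sixbit("PLUGH")
print(f"{w:012o}")          # 605465475000 (octal, as read on a PDP-10)
print(unpack_sixbit(w))     # "PLUGH " (space-padded)
```

Note that each 6-bit code is exactly two octal digits, so the packed word reads off character by character in octal, which is one reason PDP-10 programmers worked in octal rather than hex.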

Well anyway, because tools for modularizing source code were very limited, a lot of programs were written on coding sheets and placed, one line of 80 characters at a time, into dollar-bill-sized pieces of 100-pound paper called "punched cards." While you could break a program into subroutines, usually you had to put the whole program together as one monolithic block of code in order to submit it for compilation, and, hopefully (if you hadn't made any mistakes), execution. As people moved to terminals it became possible to save source code on the computer, but again, like everything else, disk space too was expensive. A disk drive which held 100 megabyte removable packs was the size of a washing machine or dishwasher today, and cost about $27,000 in 1970s dollars. Replacement packs were about a foot tall, the circumference of a dinner plate and cost $700.

Do you remember older PCs where every file took a multiple of 4K no matter its actual size? On larger files, wasting up to nearly 4K wasn't that bad, but using 4K for a 300-byte file was painful. The same problem existed on mainframes, but it was worse because, as noted above, disk space was much more expensive. Creating one 75K or 200K source file used a lot less (very expensive) disk space than 50 or 100 small source files. Also, since text editors didn't support working with multiple files at once, working with one huge file was easier than lots of small ones.
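As a quick illustration of the allocation waste described above (the numbers here are made up for illustration, not taken from any particular system):

```python
# A file consumes disk space in whole clusters, so a 300-byte file on
# a filesystem with 4K allocation units still occupies a full 4K.

CLUSTER = 4 * 1024   # 4K allocation unit, as on many older PC filesystems

def disk_usage(file_size, cluster=CLUSTER):
    """Bytes actually consumed: file size rounded up to whole clusters."""
    clusters = max(1, -(-file_size // cluster))   # ceiling division
    return clusters * cluster

# One 200K source file vs. the same code split into 100 files of 2K each:
monolithic = disk_usage(200 * 1024)       # 50 full clusters, no waste
modular = 100 * disk_usage(2 * 1024)      # each 2K file burns a 4K cluster
print(monolithic, modular)                # the split version needs twice the space
```

On cheap modern disks that doubling is invisible; at 1970s prices per megabyte it was a real argument for one big file.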

Also, Fortran 66 - and some other languages used back then - didn't really have support for source code broken into separate files. IBM Fortran IV did not have the equivalent of an INCLUDE statement, and neither did Fortran on the DEC-10, which is where CC originated. Looking at other languages: while COBOL had the COPY verb, you had to separate COPY source into a copy library, which was extra work. And the editing tools back then still didn't support editing multiple files simultaneously. If you can't quickly look from place to place in order to understand what is happening in the program you're working on, your only option is to merge everything together.

The tools we have now and the resources available make splitting large programs into many separate files trivial and the effective cost is so negligible as to round to zero. You have to get to levels of applications like the Linux kernel, Firefox or Apache to get into "serious" disk space usage levels. Back then, a serious on-line application might have 100,000 lines of code and 300 screens. Today, "serious" means at least 10 megabytes or 2,500 separate files.

So anyway, that's the history of why source code files tended to be monolithic instead of modular. I'll go on to my issue in a following article.
"In violent times, you shouldn't have to sell your soul."
- Tears For Fears, Shout
AArdvark
Posts: 9030
Joined: Tue May 14, 2002 6:12 pm
Location: Rochester, NY

Re: Modularization of source code

Post by AArdvark »

you can skip this article.
Sticky, please
Tdarcos
Posts: 6910
Joined: Fri May 16, 2008 9:25 am
Location: Arlington, Virginia
Contact:

Re: Modularization of source code

Post by Tdarcos »

Now that I've done a separate article on the history of large source code files, I can explain my issue and hear your comments. Obviously, I will be discussing working with Hugo code but the underlying facts probably apply to any IF authoring system.

I think the easiest way to handle developing a Hugo application is to create a "primary" or "main" file that includes each subsidiary file used to construct the game being developed. Since Hugo has certain position-dependent declarations that must be done first - similar to COBOL requiring a program to start with the IDENTIFICATION DIVISION - those come first, then the library files, then the application-specific ones. Then come the specific verb routines and any "replace" of system procedures, which is similar to overriding a virtual method in an object-oriented language.

At this point compilation reverts back to the main file to provide the "main" routine that runs every turn, as well as system-level overrides such as triggers and events. And then the compiler translates this, hopefully, into a working game.

So one of the things I want to ask is: how do you break up larger regions and segments? For example, in Tripkey, each region was one file. There was one file covering the rooms in the house, another for Viridian's Department Store, one for the casino, and so on. In Teleporter Test, the entrance room was one file, and because the warehouse was so massive (a 10x10 room area on the ground floor, plus entrances and paths around the building), each floor of the building (basement, ground, second, and roof) was a separate file.

Now, when designing an area, what I would do is initially "stub" it out, creating a single room to enter and exit (or several, if the region had multiple entrance points). Once I knew this worked, I would create a new file, "hive" off the room or rooms representing the target region, then replace the hived-off code with an include statement for the new file. Work could then continue either in the other area leading up to the new region, or on fleshing out the stub routines in the new region.

So, what do you think of this method and how do you handle separate regions in an interactive fiction application?
"In violent times, you shouldn't have to sell your soul."
- Tears For Fears, Shout
bryanb
Posts: 382
Joined: Sat Apr 06, 2019 2:56 pm

Re: Modularization of source code

Post by bryanb »

I feel like the content in this thread could and should be presented in a series of essays and theoreticals hosted on Reviews From Trotting Krips. Consider this lineup:

"Gettin' Modular Wit It"

"I Say Punched Cards, You Say Punch Cards: A Very Tomatoey Crowther & Woods Retrospective"

"This Is How I Hugo. How Do You Hugo?"

The Webby Awards will surely rain down like manna from heaven.
Tdarcos
Posts: 6910
Joined: Fri May 16, 2008 9:25 am
Location: Arlington, Virginia
Contact:

Re: Modularization of source code

Post by Tdarcos »

bryanb wrote:
Tue Oct 22, 2019 10:06 pm
I feel like the content in this thread could and should be presented in a series of essays and theoreticals hosted on Reviews From Trotting Krips. Consider this lineup:

"Gettin' Modular Wit It"
Yeah, sure. Either copy things over or PM or e-mail me on how to login and post messages.
bryanb wrote:
Tue Oct 22, 2019 10:06 pm
"I Say Punched Cards, You Say Punch Cards: A Very Tomatoey Crowther & Woods Retrospective"
I think the terminology is a little different because sometimes we used both terms depending on what we had done. At college I would go to the campus bookstore and either buy a deck of 100 punch cards for $1 or $2, or a box of 500 or 1,000 for maybe $5 or $10, I forget. If you could afford it - and remember, most college students had very little money then, and the idea of throwing credit cards at students to get them used to the 'debt treadmill' was years in the future - having a box of cards gave you the advantage of an easy place to store unused punch cards or used punched cards. As opposed to carrying a used or unused deck secured with a twisted rubber band in a backpack or (if fortunate) a briefcase. Or carrying an unused deck and a used deck secured with separate rubber bands.

The deck of 100 punch cards had a paper wrapper (or rather, band) around it, similar to the way a $100 brick of $1 bills has a paper band around it, and for the same reason. The box of cards was exactly that: a box surrounding 500 or 1,000 punch cards, unsecured, since the usual reason for pulling out a box was so it was easy to grab some to load into a card punch. In fact, if you wanted, you could type in a program on a terminal, then have the system punch your program on cards, the only requirement being that you had to exchange with the operator as many (new, clean) punch cards as you got back in punched cards. You could even dump all your files to tape if you provided one (which the campus bookstore also sold, for $7 or $15 depending on length; they also sold 8" floppy disks for use on the minicomputer in the math department).

Woe to you if you dropped an opened, unsecured deck or an open box of cards, for you now had the privilege of retrieving them off the floor and organizing them to be face up, cut corner to the left. Worse if it was the program you had just typed, which you now additionally had to try to painstakingly put back together (like Humpty-Dumpty).
bryanb wrote:
Tue Oct 22, 2019 10:06 pm
"This Is How I Hugo. How Do You Hugo?"

The Webby Awards will surely rain down like manna from heaven.
Yeah, right. I'll probably also win a Pulitzer and the Nobel Prize for Literature too.

But anyway, when I started, I wanted to explain as a side note why, if you looked at the source of an older program, it usually consisted of one (very large) source file and often one (very complicated) main program that did everything, even if it did sometimes break things off into subroutines. Not for nothing was the term "spaghetti code" invented. That one large main program did everything: initialization, parsing, command dispatch, and command processing.

A lot of things were invented to turn programming from an ad-hoc art into more of an engineering discipline. Structured programming; on-line editing and compilation; modular programming and "one module per file"; splitting modules and main programs into smaller bite-sized (or printed page-sized) pieces; the concept of "make" (that one blew me away when I first saw it). We had to invent (or discover) these things in order to add more formality and structure to our work.
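For anyone who hasn't seen it stated plainly, the core idea of "make" can be sketched in a few lines: a target is rebuilt only when it is missing or older than one of its prerequisites. This is just the central rule, not a full make; the file names are hypothetical.

```python
# The heart of "make": compare timestamps, rebuild only when needed.

import os

def out_of_date(target, prerequisites):
    """True if target is missing or older than any of its prerequisites."""
    if not os.path.exists(target):
        return True                            # never built: must build
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)

# Hypothetical usage: recompile only the pieces whose sources changed.
# if out_of_date("game.obj", ["game.src", "verbs.src", "rooms.src"]):
#     recompile("game.obj")
```

Applied recursively over a dependency graph, this one rule is what lets a modular program rebuild in seconds instead of recompiling every file after every edit.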

But they paid dividends, some we didn't even realize until much later. Programs have to be maintained; nobody develops a complex system fully formed (or at least, no one with any sense nowadays tries to). You start with a small piece that's barely functional. You add pieces to increase functionality. You fix bugs and errors (I differentiate here: bugs are errors that prevent compilation; errors are where the program performs in an incorrect or undesired fashion). Eventually you get something that people like and use. Then the goddam users want changes. Or, to quote David Bowie, "Ch-ch-ch-changes!" Lots of changes: new reports; changes to old reports; more functionality; changes to existing functionality; removal of unused or disallowed functionality; changes required for regulatory compliance; and so on.

This is where more formality helps. Structured programming and formal rules don't just make code easier to write; they also make it easier to maintain. If you're fixing code - someone else's, or your own "bastard spawn of Satan" - and it's written well, it's easier and more "fun" to maintain. ("Fun" being a relative term.) Or someone else gets the benefits of your discipline.

Then I realized I'd gotten into a long history lesson on "scarcity-based" programming and the historical artifacts of that era. So maybe I need to end this programming history lesson here.
"In violent times, you shouldn't have to sell your soul."
- Tears For Fears, Shout
bryanb
Posts: 382
Joined: Sat Apr 06, 2019 2:56 pm

Re: Modularization of source code

Post by bryanb »

Tdarcos wrote:
Wed Oct 23, 2019 4:24 am
bryanb wrote:
Tue Oct 22, 2019 10:06 pm
I feel like the content in this thread could and should be presented in a series of essays and theoreticals hosted on Reviews From Trotting Krips. Consider this lineup:

"Gettin' Modular Wit It"
Yeah, sure. Either copy things over or PM or e-mail me on how to login and post messages.
Great! I might just do a copy and paste job because the process of posting a new essay to RFTK is a little cumbersome at the moment (my fault). What I SHOULD have done is just let essays be WordPress posts and created an Essays category to sort them. What I did instead is make it so every essay is its own page and at the moment I have to update the essay index page manually. I have no real defense for this behavior, but I just felt like doing things the old way and even now I'm loath to do anything differently. Is there anything you'd like to change or add to your work before it gets essayfied? I'll send you my email address in any case so you'll always know where to send any potential RFTK content. If you think you might want to write some IF reviews for the site, I can create an account for you.
bryanb wrote:
Tue Oct 22, 2019 10:06 pm
"I Say Punched Cards, You Say Punch Cards: A Very Tomatoey Crowther & Woods Retrospective"
Tdarcos wrote:
Wed Oct 23, 2019 4:24 am
I think the terminology is a little different because sometimes we used both terms depending on what we had done. At college I would go to the campus bookstore and either buy a deck of 100 punch cards for $1 or $2, or a box of 500 or 1,000 for maybe $5 or $10, I forget. If you could afford it - and remember, most college students had very little money then, and the idea of throwing credit cards at students to get them used to the 'debt treadmill' was years in the future - having a box of cards gave you the advantage of an easy place to store unused punch cards or used punched cards. As opposed to carrying a used or unused deck secured with a twisted rubber band in a backpack or (if fortunate) a briefcase. Or carrying an unused deck and a used deck secured with separate rubber bands.
It's very cool that you got to experience the punch card era first hand. I've never had the pleasure/pain -- the first computer I ever used was the IBM XT my dad brought home from work. People always say using punch cards was frustrating, even maddening at times, but I find the tangibility of them very appealing in theory. You're literally holding data in your hand! It's so much less abstract than modern methods of storage which might involve innumerable unseen drives and servers located around the world. Perhaps that's why I still have a soft spot for physical media of all kinds and still prefer local storage to the cloud.
Ice Cream Jonsey
Posts: 23561
Joined: Sat Apr 27, 2002 2:44 pm
Location: Colorado
Contact:

Re: Modularization of source code

Post by Ice Cream Jonsey »

Tdarcos wrote:
Tue Oct 22, 2019 6:31 pm
A 4-GB memory rod
This is also how I am referring to computer memory from now on.
the dark and gritty...Ice Cream Jonsey!
Jizaboz
Posts: 2248
Joined: Tue Jan 31, 2012 2:00 pm
Location: USA
Contact:

Re: Modularization of source code

Post by Jizaboz »

Say team four gigabyte RAM-rod!
Tdarcos
Posts: 6910
Joined: Fri May 16, 2008 9:25 am
Location: Arlington, Virginia
Contact:

Re: Modularization of source code

Post by Tdarcos »

Jizaboz wrote:
Wed Oct 23, 2019 11:28 pm
Say team four gigabyte RAM-rod!
And a HARD disk beats a floppy, every time!

Oh god, here I am making disk jokes. I just hope I don't end up in a disk-measuring contest…
"In violent times, you shouldn't have to sell your soul."
- Tears For Fears, Shout