The evolution of abstractions

This article can serve as a language-independent introduction to software development, although it follows the evolution of one specific programming language. In form, it resembles a glossary arranged in chronological order.

Intro

Studying the development of programming languages allows you to see the evolutionary tree of this process and understand modern complex concepts. During the retrospective review, definitions of basic terms will be given, most of which are relevant to this day.

Program – record of an algorithm, the execution of which leads to the achievement of a result.

If the result is expected, then the program does not contain errors. On the other hand, there are two postulates.

Every program contains at least one bug.
Any program can be shortened by at least one command.

Of course, this is a joke, but, like any joke, there is some truth in it.

Programming – finding and implementing an algorithm in a programming language.

Algorithms are formulated in a natural language or, at best, in an algorithmic one. And somewhere in the process of translating from one language to another, errors and inaccuracies occur. From this we can conclude that

An error in a program is an inaccuracy in the implementation of an algorithm in a programming language.

Based on these definitions, we can conclude that the closer a programming language is to an algorithmic language, the less likely mistakes are in the translation process. A question arises at once: why not write programs directly in an algorithmic language? Then there would be no translation problems. Unfortunately, the answer is no. In my opinion, this is because in the early days of information technology, programming languages were “adjusted” to the capabilities of the hardware, and the capabilities of translators (compilers) were very limited. Today, the gap between the technologies for producing processors (for which programs are, in fact, written) and the mathematical and logical apparatus underlying any algorithmic language is so great that even neural networks and artificial intelligence cannot help bridge it in the foreseeable future. So let’s make the most of what we have.

The Pascal programming language is as close as possible to an algorithmic language, and its modern successor Delphi lets you implement all the concepts that have appeared during the evolution of programming languages. The only inconvenience I will note is that the keywords of the language are English words that cannot be redefined. For English-speaking developers, however, this brings it as close as possible to their natural language. And although Pascal was created in 1970 by Professor Niklaus Wirth as an academic language for teaching students, it is still relevant.

There are concepts that do not depend on a particular programming language, but underlie the understanding of how they work. Let’s consider them in more detail in the aspect of the development of abstractions used in software development.

Zero Abstraction Level

At the zero level of abstraction, programmers operated with concepts from the physical world: processor instruction codes, processor registers, memory cells, input-output ports, and so on. The first programming languages (low-level programming languages) allowed mnemonic names to be used instead of machine codes, and text labels instead of memory addresses. Although this gave complete control over the performance of the code, program development itself was very slow. What did programmers have to deal with then?

Program code

Program — a sequence of instructions executed by a computer.

Subroutine — a part of a program that performs certain actions.

From the point of view of the algorithm, a subroutine is a concept that allows you to first of all get rid of repeating fragments, and secondly – to scale the algorithm, dividing it into small blocks that are available for holistic perception. I emphasize the importance of both points. The first aspect allows you to save the memory in which the program code is stored, and also makes it easier to make changes to the program. The second saves the time needed to understand the algorithm (or program), and also allows you to implement very complex systems by breaking them into small fragments – subroutines.

At a low level, to call a subroutine, the CALL <address> instruction is used, which first writes the address of the instruction following it to the call stack (a special area of computer memory), and then transfers control to the specified address. To return from a subroutine, use the RET instruction, which retrieves the jump address from the call stack.
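The CALL / RET mechanism described above can be sketched with a toy interpreter. This is only an illustration in Python, not real machine code: the instruction set (including the OUT and HALT instructions) and the `run` function are assumptions made up for this example.

```python
# Toy machine: each instruction is a (mnemonic, operand) pair.
# An "address" is simply an index into the program list.
def run(program):
    call_stack = []   # return addresses saved by CALL
    trace = []        # values emitted by the hypothetical OUT instruction
    pc = 0            # program counter: address of the next instruction
    while True:
        op, arg = program[pc]
        if op == "CALL":
            call_stack.append(pc + 1)  # save the address of the instruction after CALL
            pc = arg                   # transfer control to the subroutine
        elif op == "RET":
            pc = call_stack.pop()      # resume at the saved return address
        elif op == "OUT":
            trace.append(arg)
            pc += 1
        elif op == "HALT":
            return trace


program = [
    ("OUT", "main: start"),    # address 0
    ("CALL", 4),               # address 1
    ("OUT", "main: end"),      # address 2
    ("HALT", None),            # address 3
    ("OUT", "sub: working"),   # address 4, subroutine entry point
    ("RET", None),             # address 5
]

result = run(program)
```

After the run, `result` lists the three messages in execution order: the subroutine body is executed between the two instructions of the main program that surround the CALL.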

The concept of “parameter” in low-level languages is conditional, and is determined not by the semantics of the language, but logically – parameters are passed through processor registers or through the data stack. At the same time, subroutines could “return” several parameters as the results of their work.

Data

Variable — a named area of memory. The value of a variable is the contents of the specified memory area. While the program is running, the value of the variable may change.

What exactly is in this area (integer, text or array element) was determined by the logic of the program, and had no control from the programming language, which often led to errors and made it difficult to write safe code.

Constant — named area of memory. The value of a constant is the contents of the specified memory area. The value of the constant does not change while the program is running.

As you can see, there is little difference between variables and constants, and it is up to the compiler to control the immutability of constants.

There were no expressions at the zero abstraction level – the commands, if necessary, included the data to be processed.

First Abstraction Level

Commands have been replaced by statements – abstract instructions that are not tied to the codes executed by the processor.

Statement – the smallest structural unit of a programming language.

Statements can be divided into imperative (performing actions) and declarative (declaring various entities of the language: variables, constants, procedures, etc.).

Keyword — a string identifier that has a special meaning.

A statement can include one or more keywords. Keywords cannot be used as names for variables, constants and functions.

Data types

Data type (type) — set of values and operations on these values.

Types can be divided into simple types and composite types (collections of elements). Simple types are divided into standard and user-defined types. User-defined simple types are enumerated types and subrange types. Composite types are homogeneous (arrays) and heterogeneous (records).

Enumerated type – a data type whose set of values is a limited list of identifiers.

Set type – a data type whose values are sets of elements of an enumerated type.

Subrange type – a data type whose set of values is a range of values of a standard type.

Array — a collection of elements of the same type with a finite size. Array elements are accessed by index.

Record — a structured combined data type consisting of a fixed number of components (fields) of different types.
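The kinds of types listed above can be approximated in Python, which is used here only for illustration. The names `Color`, `Point`, `MONTHS` and the sample values are hypothetical, and the correspondence is loose: a Python list is not fixed in size, and `range` only imitates a subrange type.

```python
from dataclasses import dataclass
from enum import Enum


class Color(Enum):                     # enumerated type: a closed list of identifiers
    RED = 1
    GREEN = 2
    BLUE = 3


warm = frozenset({Color.RED})          # set type: a set of enumerated values

MONTHS = range(1, 13)                  # rough analogue of a subrange type: 1..12

temperatures = [21.5, 19.0, 23.25]     # array: elements of one type, accessed by index


@dataclass
class Point:                           # record: a fixed number of fields of different types
    x: int
    y: int
    label: str


origin = Point(0, 0, "origin")
second_reading = temperatures[1]
```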

Typing

There are some connections (dependencies) between the type, variable and its value that determine the logic of working with these entities in the context of a specific high-level programming language.

Typing — the way a programming language defines a data type. Typing has three aspects.

Static / dynamic typing. With static typing, the final types are fixed at compile time. With dynamic typing, all types are determined at run time.
Strong / weak typing (also sometimes called strict / loose). A strongly typed language does not allow mixing different types in expressions and does not perform automatic implicit conversions; for example, you cannot subtract a set from a string. Weakly typed languages perform many implicit conversions automatically, even if precision may be lost or the conversion is ambiguous.
Explicit / implicit typing. In explicitly typed languages, the type of new variables (as well as of functions and their arguments, which are discussed below) must be specified explicitly. Languages with implicit typing shift this task to the compiler or interpreter.
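The three aspects can be observed in a few lines of Python, chosen here only as a convenient illustration. Python combines dynamic, strong and implicit typing, so it differs from Pascal on two of the three axes. The variable names are hypothetical.

```python
value = 42                              # implicit typing: no type is declared
first_type = type(value).__name__
value = "forty-two"                     # dynamic typing: the same name now holds a string
second_type = type(value).__name__

# Strong typing: unrelated types are not converted silently.
try:
    "abc" - {1, 2}                      # attempt to subtract a set from a string
    outcome = "allowed"
except TypeError:
    outcome = "rejected"
```

The type of `value` changes at run time, while the attempt to subtract a set from a string is rejected with a `TypeError` instead of being converted implicitly.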

Expressions

Expression — a sequence of operands and operators that evaluates to a value of a particular type.

Operand — an entity with which operations are performed: variables, constants, functions, expressions.

Operations — actions that are performed on operands. Operations can be binary (performed on two operands) and unary. Operations differ in types (mathematical, logical, string) and define the data type (see Data type above).

If the type of the variable and the type of the expression do not match, then a type cast is required.

Type casting — the conversion of a value of one type to a value of another type.

Type casting can be explicit (specified by the programmer) or implicit (performed automatically according to the casting rules of the programming language).
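A minimal sketch of both kinds of casting, written in Python for illustration only; the variable names and values are assumptions. In the first expression the language widens the integer automatically, in the other two the programmer requests the conversion.

```python
count = 7
ratio = 2.5

total = count + ratio          # implicit cast: the int operand is widened to float
truncated = int(total)         # explicit cast: float to int, the fraction is discarded
text = str(truncated) + "!"    # explicit cast: int to str before concatenation
```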

It should be noted that Pascal uses static, strong, explicit typing, but Delphi also has a special Variant type that allows dynamic typing. The language also casts some types automatically and supports function overloading (functions with the same name but different parameter lists), which can be regarded as a weakening of the typing.

Functions

With the advent of the first level of abstraction in programming languages, subroutines have parameters (formal and actual), and subroutines themselves are divided into procedures and functions.

Function — a subroutine that returns a result (value) of a certain type.

Functions are usually used in expressions and their properties resemble variables (functions have a name and a type), the value of which is calculated at the moment the function is called. Therefore, everything that has been said about typing applies to functions as well.

Procedure – a subroutine that does not return a result.

Parameter — an argument of a certain type accepted by the function.

Formal parameter — an argument specified when declaring a function.

Actual parameter — the argument passed to the function when it is called.

To call a function, its name is used; the return occurs automatically when the function completes or when a special exit statement is executed. When a function or procedure is called, the number and types of actual parameters must match the number and types of formal parameters, and a function returns only one result. However, both limitations can be circumvented. In the first case, you can give a parameter a default value; in the second, you can use variable parameters (var in Pascal, and also out in Delphi), whose values are passed back to the place where the function or procedure was called.
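Both workarounds can be sketched in Python, used here only for illustration. Python has no variable parameters, so returning a tuple stands in for them in this sketch; the functions `power` and `divide` are hypothetical.

```python
from typing import Tuple


def power(base: float, exponent: int = 2) -> float:
    """Function with a default parameter value: `exponent` may be omitted."""
    return base ** exponent


def divide(dividend: int, divisor: int) -> Tuple[int, int]:
    """Hands back two results at once (a stand-in for variable parameters)."""
    return dividend // divisor, dividend % divisor


squared = power(3.0)                   # one actual parameter, the default 2 is used
cubed = power(3.0, 3)                  # both actual parameters are given
quotient, remainder = divide(17, 5)    # two results received by the caller
```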

There is also a special procedural type. Using this type, you can write code in which the names of called functions and procedures can be changed programmatically, set as a parameter.
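The idea of a procedural type can be illustrated in Python, where functions are ordinary values. This is a sketch, not Pascal syntax: the alias `BinaryOp` and the functions `add`, `multiply` and `apply` are hypothetical names.

```python
from typing import Callable

# Rough analogue of a procedural type: any function taking two ints and returning an int.
BinaryOp = Callable[[int, int], int]


def add(a: int, b: int) -> int:
    return a + b


def multiply(a: int, b: int) -> int:
    return a * b


def apply(op: BinaryOp, a: int, b: int) -> int:
    """The function to call is not fixed in the code: it arrives as a parameter."""
    return op(a, b)


total = apply(add, 3, 4)
product = apply(multiply, 3, 4)
```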

Modules

With the growth of complexity and the increase in the volume of program source codes, it became necessary to divide the code into modules.

A module — a small block of program code that contains functions, variables and constants necessary to solve problems of a certain type.

A module has a name (an identifier unique within the application) that can be used as a prefix when calling a function. This may be necessary if different modules of the project contain functions with the same name.

Constants

Here we are talking about declared constants, that is, constants that are given an identifier before they are used in expressions. Constants can be divided into several kinds.

True constants – constants whose declaration contains only an identifier and a value. The type of such a constant is determined by the type of the value. The value of a constant can be given by a constant expression.

Constant expression – an expression whose value the compiler can evaluate without executing the program in which it is included. Constant expressions include numbers, character strings, true constants, enumerated values, the special constants True, False, and nil, and expressions built from these elements using operators, type conversions, and set constructors.

Constant expressions cannot include variables, pointers, or function calls, except for calls to the following built-in Pascal functions: Abs(), High(), Low(), Pred(), Succ(), Chr(), Length(), Odd(), Round(), Swap(), Hi(), Lo(), Ord(), SizeOf(), Trunc().

Typed constants – constants whose declaration contains an explicit indication of their type.

A special case of typed constants is the constant array, whose value is set in a special way. Such arrays can be multidimensional.

You can also declare record constants.

There are also so-called procedural constants, whose purpose is to provide an alternative name for a function.

Scope

The scope (visibility) of variables, constants, procedures and functions depends on where they are declared. If they are declared inside a procedure or function, the scope is limited to that procedure or function. If they are declared at module level in the implementation section, the scope extends to the whole module. If they are declared at module level in the interface section, the scope extends to all modules that list that module in their uses clause.

Second Abstraction Level

A data type has evolved into a class, a function into a method, a field into a property, a variable into an object. But the old concepts did not disappear, they continued to exist along with the new ones.

Class

Class – data (fields and properties) together with the methods that process it.

Encapsulation — combining data together with methods of processing it and hiding information about how this processing is carried out.

Inheritance — creating a new class based on another (parent) class while maintaining access to data and methods of the parent class.

Class hierarchy – the tree of classes from which a program is built. In Delphi, the TObject class is at the root of this tree.

Polymorphism — the ability for objects with the same specification to have different implementations. In particular, the ability of functions to process data of various types.
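Encapsulation, inheritance and polymorphism can be seen together in a small sketch. It is written in Python for illustration only, and the classes `Shape`, `Rectangle` and `Square` are hypothetical.

```python
class Shape:
    def __init__(self, name: str):
        self.name = name                      # data kept together with its methods

    def area(self) -> float:
        return 0.0

    def describe(self) -> str:
        # Calls whichever area() belongs to the actual class of the object.
        return f"{self.name}: {self.area():.2f}"


class Rectangle(Shape):                       # inheritance: Rectangle is based on Shape
    def __init__(self, width: float, height: float):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self) -> float:                  # same specification, different implementation
        return self.width * self.height


class Square(Rectangle):                      # one more level of the class hierarchy
    def __init__(self, side: float):
        super().__init__(side, side)
        self.name = "square"


descriptions = [s.describe() for s in (Shape("shape"), Rectangle(2, 3), Square(2))]
```

The `describe` method is written once in the parent class, yet it produces a different result for each object because `area` is resolved according to the object's own class.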

Method

Methods — procedures and functions that belong to a given class.

In addition to ordinary procedures and functions, a class contains a constructor and a destructor, which are used to create and destroy objects.

Constructor — a function that creates an instance of the given class and returns it as a result.

Destructor — a procedure that destroys an object and frees the memory it occupies.

Methods can be declared with various attributes that define their scope and functionality.

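Virtual method – a method whose implementation is chosen at run time according to the actual class of the object. The method table of a descendant class contains a full copy of the entries of the ancestor class, so every virtual method can be found directly.

Dynamic method – a method that is dispatched like a virtual one, but the table of a descendant class lists only the methods that the class itself introduces or overrides; for all others the call refers to the tables of the ancestor classes.

Virtual methods are used when high performance is required (frequent calls, methods overridden in many descendants), while dynamic methods save memory (rare calls, many descendants that do not override the method).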

Override method — a method inherited from a parent class, the implementation of which in the derived class has been changed. You can override a virtual or dynamic method.

Overload methods — methods with the same name but different parameters. Used to implement polymorphism.

A static method — a method that can be called without instantiating an object.

Static methods are used to create library functions, grouping them into a class for convenience.

An abstract method — a method that does not contain an implementation in the class in which it is declared.

The implementation of an abstract method must be provided in a derived class. Attempting to call an abstract method while the program is running causes a run-time error (the EAbstractError exception in Delphi).
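Abstract, overridden and static methods can be sketched with Python's standard `abc` module. This is an illustration under stated assumptions: the classes `Exporter` and `UpperExporter` are hypothetical, and Python reports the error earlier than described above, when the abstract class is instantiated and not when the method is called.

```python
from abc import ABC, abstractmethod


class Exporter(ABC):
    @abstractmethod
    def export(self, text: str) -> str:
        """Abstract method: declared here, implemented only in descendants."""

    @staticmethod
    def normalize(text: str) -> str:
        """Static method: can be called without creating an object."""
        return " ".join(text.split())


class UpperExporter(Exporter):
    def export(self, text: str) -> str:       # overrides the abstract method
        return self.normalize(text).upper()


cleaned = Exporter.normalize("  a   b ")      # called on the class itself
exported = UpperExporter().export(" hi  there")

try:
    Exporter()                                # the abstract class cannot be instantiated
    abstract_use = "allowed"
except TypeError:
    abstract_use = "rejected"
```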

Message handling

With the advent of the Windows operating system, an event-driven approach to programming began to be used. It is based on the fact that the program has special procedures – event handlers.

Event – an action performed by the user or the operating system, or a consequence of such an action that arises within software components. When an event occurs, a message is generated, which leads to a sequential, hierarchical call of message handlers.

Message handler — a procedure designed to process messages from the operating system and other program components.

Event handler — a procedure designed to handle an event of a certain type, which is called when this event occurs. This mechanism allows you to extend the functionality of classes by creating additional logic for the class.
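The event handler mechanism can be sketched without any operating system involved. The example is in Python for illustration only; the `Button` class, its `on_click` slot and the `report_click` handler are hypothetical names.

```python
class Button:
    def __init__(self, caption: str):
        self.caption = caption
        self.on_click = None              # event: a slot that may hold a handler

    def click(self):
        """Simulates the event; the handler is called only if one is assigned."""
        if self.on_click is not None:
            self.on_click(self)


log = []


def report_click(sender):
    """Event handler: extra logic attached to the class from outside."""
    log.append(f"clicked {sender.caption}")


button = Button("OK")
button.click()                            # no handler assigned yet: nothing happens
button.on_click = report_click
button.click()                            # the handler is called with the sender
```

The `Button` class itself is not changed; its behavior is extended by assigning a procedure to the event.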

Property

Property — a class attribute whose value is accessed via methods.

The property hides the fields of the class that normally store the object’s data. Moreover, since data is accessed through methods, there may be no fields at all.
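A property with a hidden field and a property with no field at all can be sketched as follows. Python is used only for illustration, and the `Temperature` class is a hypothetical example.

```python
class Temperature:
    def __init__(self, celsius: float):
        self._celsius = celsius            # hidden field that stores the data

    @property
    def celsius(self) -> float:            # read access goes through a method
        return self._celsius

    @celsius.setter
    def celsius(self, value: float):       # write access goes through a method
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self) -> float:         # no field at all: the value is computed
        return self._celsius * 9 / 5 + 32


t = Temperature(100.0)
boiling_f = t.fahrenheit
t.celsius = 0.0
freezing_f = t.fahrenheit
```

Because every access passes through a method, the class can validate new values and can offer `fahrenheit` without storing it anywhere.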

Scope

The scope of a class depends on whether it is declared in the interface section or in the implementation section.

Sometimes it becomes necessary to declare public classes in the implementation section in order to avoid circular uses references in the interface section. Then the class forward declaration is used.

Different sections inside the class description are used to set different scopes of class members (methods and properties).

Private. The scope of class members is limited to the class itself.

Protected. Class members declared in this section are visible within the class itself and its descendants.

Public. Members of a class are available to anyone who has access to the class. This is the default scope, meaning that if you don’t specify a section, the properties and methods are assumed to be public.

Published. This is a special section for properties and event handlers that must be shown in the component property editor of the design environment. Putting properties, methods, and fields in a published section causes the compiler to generate meta-information (RTTI) for them, allowing the object to “learn about itself” at run time.

Automated. This section contains properties that are available to everyone. Used in TAutoObject class descendants when creating OLE Automation servers.

Friend classes are classes located in the same module. Friend classes can have access to all members of classes located in the same module. If it is required to restrict “friendliness”, then the strict keyword is added to the declaration of the private and protected sections.

Object

Object — an instance of a class created by its constructor method.

Third Abstraction Level

The class has evolved into an interface.

Interface

Interface — a class that cannot contain fields, nor does it have a constructor or destructor. All methods are abstract, and access to properties is strictly through interface methods.

All interface functionality is implemented in the class that implements the interface. A class can inherit from only one parent class, but it can implement multiple interfaces.

Historically, interfaces were needed to implement OLE mechanisms, but later they began to be used for many purposes and tasks.

OLE (Object Linking and Embedding) is a technology for linking and embedding objects into other objects developed by Microsoft. OLE allows you to transfer part of the work from one program to another and return the results back.

An interface allows you to describe some desirable properties that entities can have without revealing their internal structure.
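As a rough illustration, an interface can be imitated in Python by an abstract class that has no fields and only abstract methods. Python has no separate interface construct, so this is an approximation; the names `Printable`, `Comparable`, `Version` and `render_all` are hypothetical.

```python
from abc import ABC, abstractmethod


class Printable(ABC):                          # "interface": no fields, only abstract methods
    @abstractmethod
    def to_text(self) -> str: ...


class Comparable(ABC):                         # a second, independent "interface"
    @abstractmethod
    def less_than(self, other) -> bool: ...


class Version(Printable, Comparable):          # one class implements both
    def __init__(self, major: int, minor: int):
        self.major = major
        self.minor = minor

    def to_text(self) -> str:
        return f"{self.major}.{self.minor}"

    def less_than(self, other) -> bool:
        return (self.major, self.minor) < (other.major, other.minor)


def render_all(items):
    """Relies only on the Printable contract, not on the internal structure."""
    return [item.to_text() for item in items]


rendered = render_all([Version(1, 2), Version(2, 0)])
older = Version(1, 2).less_than(Version(2, 0))
```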

What’s next?

There is no definite answer to this question, since all significant discoveries and developments in the field of programming languages were made in the last century. New languages appear, but they only play with some of the nuances of the implementation of the concepts outlined above, simplify writing code by shortening expressions, thereby complicating code understanding.

An important role in this process is played by the war of standards – part of the corporate wars and brand confrontations.

There is a contradiction: it is easiest to express goals and ways to achieve them in a natural language, but no natural language has the completeness of formal logic, and natural languages are constantly changing, not always for the better. Specialized languages allow you to formulate algorithms precisely, but they are difficult to learn and remember, carry the features of their subject area, or are not expressive enough to solve general problems.

There is a lot of talk about artificial intelligence and self-learning systems that do not require programming. But this is a myth that, on the one hand, serves to give pathos to very prosaic tasks that developers of such systems solve, and on the other hand, gives rise to dreams of a world where “robots work hard, not humans.”

But perhaps you have other information on this matter. Write in the comments and we will discuss it.
