Thursday, 31 May 2012

Relocation and Linking concepts


Program relocation : in program instruction and data use some address. Program assumes its instructions and data to occupy memory words with specific address.
Such a program is called an address sensitive program. It contains one or more of the following:
1.An address sensitive instruction : an instruction which uses an address.
2.An address constant : a data word which contains an address.
Definition : program relocation is the process of modifying the address used in the address sensitive instructions of a program such that the program can execute correctly from the designated area of memory.
In complete process translation origin and linking are always different. but linking origin and loading origin may be same or different.
If loading origin and linking origin are different this means relocation must be performed by the loader.
Linker always performs relocation. But some loaders do not do this.
If loaders do not perform relocation this means load origin and link origin are equal, and this loader is called absolute loader.
If loaders perform relocation then loaders are called relocation loaders.
Performing relocation : we can calculate the relocation factor by linking origin and translate origin. Suppose translated origin and linked origin of program are t_origin and l_origin respectively. Consider a symbol  symb  in program P. its translation time address be tsymb and link time address be lsymb. The relocation factor of program is defined as
  relocation_factor = l_orgin – t_origin
Relocation factor can be positive, negative or 0.
Here symb  is working as an operand. The translator puts the address tsymb  in the instruction for it.
                     Tsymbt_origin + dsymb    
where dsymb is the offset of symb in program.
lsymb = l_origin + dsymb
lsymb = t_origin +relocation_faction +dsymb
       = t_origin + dsymb + relocation_factior
       = tsymb + relocation_factor

Linker


A Linker is a program that is used to properly combine all the object program files of the s/w and to convert them into the final executable, which is called load module.
It means that a linker takes object program files and fits them together to assemble them into the program final executable form.
A module approach is used to develop reasonable sized software. In this approach the s/w is divided into a no. of smaller program and module.
It is easier to develop, test and debug smaller program.
Steps involved in execution of a program 
1.Translation of the program.
2.Linking of the program with other programs needed for its execution.
3.Relocation of the program to execute from the specific memory area allocated to it.
4.Loading of the program in the memory for the purpose of execution. 
 
Translated, linked and load time addresses
While compiling a program, a translator is given an origin specification for program. This is called the translated origin of program.
The program can specify the origin in a START or ORIGIN statement.
The translator uses the value of the translated origin to perform memory allocation for the variable declared in program.
The execution start address or simple the start address of a program is the address of the instruction from which its execution must begin.
The start address specified by the translator is the translated start address of the program.
The origin of a program may have to be changed by the linker or loader by one of two reasons.
1.The same set of translated address may have been used in different object modules constituting a program.
2.An operating system may require that a program should execute from a specific area of memory. This may require a change in its origin. The change of origin leads to changes in the execution start address and in the address assigned to symbols.
Following terminology is used to refer to the address of a program entity at different times :
1.Translation time address : address assigned by the translator.
2.Linked address : address assigned by the linker.
3.Load time or load address : address assigned by the loader.
The same prefixes translation time, linked and load time are used with the origin and execution start address of a program. Thus
1.Translated origin : address of the origin assumed by the translator.
2.Linked origin : address of the origin assigned by the linker while producing a binary program.
3.Load origin : address of the origin assigned by the loader while loading the program execution.
The linked and load origins may differ from the translated origin of a program.

Overview of interpreter


The interpreter consists of three main components:
1.Symbol Table : the symbol holds information concerning entities in the source program.
2.Data Store : the data store contains values of the data items declared in the program being interpreted. The data store consists of a set of components. A components is an array containing elements of a distinct type.
3.Data manipulation routines : a set of data manipulation routine exist. This set contains a routine for every legal data manipulation action in the source language.

Interpreter

This summary is not available. Please click here to view the post.

PARAMETER PASSING MECHANISM


1.CALL BY VALUE:- In this, values of actual parameters are passed to the called function. these values are assigned to the corresponding formal parameters. A called function may allocate memory to a formal parameter & copy the value of actual parameter into this location at every call  .
1.CALL BY REFERENCE:- In this, the address of actual parameters is passed to the called function. If the parameter is an expression ,its value is computed & stored in a temporary location & the address of temporary location is passed to the called function. If the parameter is an array element, its address is similarly computed at the time of call.
 

Aspect of Compilation


A compiler bridges the semantic gap between a PL(programming language) domain and an execution domain.

There are 2 aspects of compiler
1.
Generate code to implement meaning of a source program in the execution domain.
2.
Provide problem solving approach for harms of PL semantics in a source program.
     There are so many features which contribute to the semantic gap between a PL domain and an execution domain.

Data Types

Data structures

Scope rules

Control structure
1.
Data Type : it is the specification of legal values for variables of the type and legal operations on the legal values of the type. Semantics of datatype require a compiler to ensure that variables of a type are assigned or manipulated only through legal operations.
  
The following tasks are involved in ensuring this:

Checking legality of an operation for the types of its operands. This ensures that a variable is subjected only to legal of its type.

Use type conversion operations to convert values of one type into values of another type wherever necessary and permissible according to the rules of a PL.

Use appropriate instruction sequence of the target machine to implement the operations of a type. Ex :
  
Float x,y;
  Int
i=8,j
  
Y=5;
    X=y + i;

Ex:
    
int i;
    float
a,b;
    a=
b+I;


Assembly code:
    
CONV_R   AREG, I
    
ADD_R     AREG, B
    
MOVEM   AREG, A

Data Sturcture :
   
array , structure

Scope rules :
  
{
  
float x,y;
  
{
  
int a,b;
  b=
x+y
  
}
}

Control Structure :
   
if condition, switch….case, loop
Memory allocation

Memory allocation involves three important tasks:
1.
Determine the amount of memory required to represent the value of a data item.
2.
Use an appropriate memory allocation model to implement the lifetimes and scopes of data items.
3.
Determine appropriate memory mappings to access the values in a non scalar data item, ex : array.
Memory Binding

A memory binding is an association between the memory address attribute of a data item and the address of a memory area.
1.
Static memory allocation
2.
Dynamic automatic memory allocation
3.
Program controlled allocation
Extended Stack

A stack is a linear data structure. Which follow the concept of LIFO.

In data structure an area of memory is reserved for the stack.

A pointer called the stack base (SB) points to the first word of the stack .

Last element of the stack is called TOP OF Stack (TOS).

To SB and TOS 2 new pointers exits in the model.
1.
A record base pointer(RB) pointing to the first word of the last record in stack.
2.
The first word of each record is a reserved pointer.

Memory allocation & access

Automatic dynamic allocation is implemented using the extended stack model.

SB(stack base)

TOS 

RB(record base pointer)

Reserve pointer

Each record in the stack has 2 reserved pointers instead of one. Each record accommodates the variables for one activation of a block, hence we call it an activation record (AR). During the execution of a block structured program , a register called the activation record base(ARB)  always points to the start of the TOS record. This record belongs to the block which contains the statement being executed.

The first reserved pointer in a block AR points to the activation record of its dynamic parent. This called Dynamic pointer. The dynamic pointer is used for deallocating an AR.

Access to nonlocal variables is implemented using the second pointer called Static pointer. When AR is created for block its static pointer is set to point to the AR of static ancestor of the block.

Symbol table requirements : In order to implement dynamic allocation and access, a compiler should perform :
1.
Determine the static nesting level of  current block.
2.
Determine the variable designated by the name. this should be done in accordance with the scope rules.
3.
Determine the static nesting level of the block in which variables are defined.
4.
Generate the code.
A simple scheme to implement function 1,2 and 3.
  using extended stack model to organize symbol table.
When the start of block(current block) is encountered during compilation, a new record is pushed on the stack. It contains
1.
Nesting level of current block.
2.
Symbol table for current block.
The reserved pointer of the new record points to the previous record in the stack.
Compilation of expression

In source code we use so many types of expression, in code generation for expression following task are performed :
1.
Determination of an evaluation order for the operators in an expression.
2.
Selection of instruction to be used in the target code.
3.
Use of register and handling of partial results.
     The expression is evaluated according to precedence rule of operators. 

The choice of an instruction to be used in the target code depends on the following :
1.
The type and length of each operand.
2.
The addressability of each operand, where the operand is located and how it can be accessed.

The addressability of an operands indicates where an operand is located and how it can be accessed.

An operand descriptor to maintain the type, length and addressability information for each operand. The choice of an instruction can be made by analyzing an operator and the descriptors of its operands.

A partial result is the value of some sub expression computed while evaluating an expression.

Partial results are maintained in CPU registers. Some of them have to be moved to memory if no of results exceeds the no of available CPU registers. An important issue in code generation is when and how to move partial result is contained in a register. We use a register descriptor to maintain information.

Operand descriptors :
    
An operand descriptor has the following fields :
1.
Attributes : contains the subfields type, length and miscellaneous information.
2.
Addressability : specifies where the operand is located and how it can be accessed. It has 2 subfields
A.
Addressability code : Takes the values ‘M’ (operand is in memory) and ‘R’ (operand is in register).
B.
Address : Address of a CPU register or memory word.

Register Descriptor

A register descriptor has 2 fields
1.
Status : Contains the code free or occupied to indicate register status.
2.
Operand descriptor #: if status= occupied, this field contains the descriptor # for the operand contined in the register.

Register descriptions are stored in an array called Register descriptor. One register descriptor exists for each CPU register. 









The Compilation process


The objective of the compiler is to transform a program written a high-level programming language from source code into object . Programmers write in a high-level programming languages from source code into object code. Programmers write program in a form called source code. Source code must go through several steps before it becomes an executable program.
The first step is to pass the source code through a compiler, which translates the high-level languages instruction into object code. At last it producing an executable code. After produced object code , it is pass to the linker. The linker combines modules and gives real values to all symbolic addresses.

Every high-level programming language comes with a compiler. The compiler the language, because it defines which instructions are acceptable.
Compilers translates source code into object code, which is unique for each type of computer, many compilers are available for the same language.
General models of compilers(Phases of C)
1.Lexical analysis : recognition of basic elements and creation of uniform symbols.
2.Syntax analysis : recognition of basic syntactic constructs through reductions.
3.Interpretation (Symantec analysis) : definition of exact meaning, creation of matrix and tables by action routines.
4.Machine independent : creation of more optimal matrix.
5.Storage assignment : modification of identifier and literal tables. It makes entries in the matrix that allow code generation to create code that allocates dynamic storage, and that also allow the assembly phase to reserve the proper amounts of static.
6.Code generation : use of macro processor to produce more optimal assembly code.
7.Assembly and output : resolving symbolic addresses and generating language.
Compiler also manage the database, in the database it manages
A.Source code
B.Uniform symbol
C.Terminal table
D.Identifier table
E.Literal table
F.Reductions
G.Matrix
H.Code productions
I.Assembly code
J.Relocatable object code
1.Source code : program
2.Uniform symbol table : Consist of a full or partial list of the token as they appear in the program. Created by lexical analysis and used for syntax analysis and Symantec analysis.
3.Terminal table :  a permanent table which lists all key words and special symbols of the language in symbolic form.
4.Identifier table : it contains all variables in the program and temporary storage and any information needed to reference or allocate storage for them.
5.Literal table : it contains all constants in the program creation and use similar to the identifier table.
6.Reductions : permanent table of decision rules in the form of patterns for matching with the uniform symbol table.
7.Matrix : intermediate table of decision rules in the form of pattern for matching with the uniform symbol table.
8.Code productions : permanent table of definitions. There is one entry defining code for each possible matrix operator.
9.Assembly code : assembly language version of the program which is created by the code generation phase.
10.Reloadable object code : final output of the assembly phase, ready to be used as input to loader.