Thorsten Pawletta, University of Wismar, Dep. of Mechanical, Process and Environmental Eng.


A class library for persistent object management in C++

Author : Jens-Uwe Dolinsky
e-mail : u.dolinsky@iname.com



More detailed information about the project can be found in the journal paper:

Dolinsky J.-U. and Pawletta T.: A lightweight class library for extended persistent object management in C++, Software - Concepts & Tools, Volume 19 Issue 2 (1998) pp 71-79, Springer-Verlag Berlin Heidelberg.


Contains :
    1. Introduction
    2. Features of the class library
    3. Implementation details
    3.1. Class charlist_typ
    3.2. Class OBJECT_list
    3.3. The message parameter structure
    3.4. Class persistent
    3.4.1. Implementation interface : the virtual method move_data()
    3.4.2. Persistence methods
    3.4.3. Global object generation
    3.4.4. Symbolic constants and macros
    3.4.5. The two interface methods load() and save()
    3.5. Implementation example
    4. Abstract algorithm
    5. User manual
    6. Consideration of subtrees and variable structure
    7. View forward and note
    Download : persis.tar.gz (18 KB)


1. Introduction

The standard language scope of C++ offers no built-in solution for the persistent storage of class instances, as Smalltalk, for example, does.
This documentation describes the implementation and functionality of a general, efficient persistence mechanism for C++ classes. It is realised as a generic base class "persistent", which encapsulates all necessary data structures and methods. Any class can inherit these mechanisms for loading and storing member data of different types. The data elements of an object can be instances of scalar or structured data types. Dynamic data can be processed as well, but some particularities, explained below, have to be considered. Arbitrarily linked structures, which may include cycles, can be stored and reconstructed.


2. Features of the class library


3. Implementation details

The implementation consists of three class definitions and one structure definition:

  1. charlist_typ: private list type
  2. OBJECT_list: private list type
  3. persis_paramtyp: private message parameter
  4. persistent: generic base class


3.1. Class charlist_typ

struct charlist_element
{
    char   *value;
    struct charlist_element *next;
};

class charlist_typ
{
   charlist_element *root;
  public:
   charlist_typ() { root = NULL;}
   ~charlist_typ();
   void create(char* &s,unsigned l);
   void kill(char* s);
};

The class charlist_typ implements a simple singly linked list of pointers to dynamically allocated memory areas. A list entry therefore contains such a pointer and a pointer to the next list entry. The class itself encapsulates a pointer to the first list element. Calling the method create(s,l) allocates a dynamic variable of l bytes and returns its address via the parameter s. In addition, this address is registered in the internal list by creating a list element that contains it. Calling the method kill(s) releases the memory area referenced by s, but only if s has been registered in the list.
When a charlist_typ instance is destroyed, its destructor deletes the entire list and releases all memory areas of dynamic variables referenced by the list elements.
This feature (garbage collection) is used in the generic class persistent, which encapsulates an instance of the class charlist_typ as a member. Using this list, the persistence mechanism can recognise whether or not a dynamic variable has been created by it. The following class definition illustrates the problem:

class bsp
{
    char* name;
  public:
    bsp(char* s) {name = s;}
};

An instance of this class initialises its member element <name> by calling the constructor with a pointer to a string. This string is saved when the instance is stored persistently. Before the string can be reloaded (perhaps by another application), memory has to be reserved for it. As a consequence, the instance then contains a pointer to a dynamic variable whose memory area is not managed by the instance itself. Deleting this object would cause a memory leak! The problem also occurs with repeated reinitialisation: if an instance is reinitialised several times, memory leaks occur from the second reinitialisation onwards.
The implemented garbage collection avoids these problems.
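
The registration idea can be illustrated with a minimal, self-contained sketch. The names block_node, block_registry and owns() are hypothetical and not part of the library, and kill() returns a bool here only to make the behaviour observable. The sketch assumes that the list owns every block it created and ignores every other pointer:

```cpp
#include <cassert>

// Hypothetical stand-in for charlist_element.
struct block_node {
    char*       value;  // address of a block allocated by the registry
    block_node* next;   // next registered block
};

// Hypothetical stand-in for charlist_typ: remembers every block it allocated.
class block_registry {
    block_node* root = nullptr;
public:
    block_registry() = default;
    block_registry(const block_registry&) = delete;            // the list owns its blocks
    block_registry& operator=(const block_registry&) = delete;

    // Releases every block that is still registered.
    ~block_registry() {
        while (root != nullptr) {
            block_node* n = root;
            root = n->next;
            delete[] n->value;
            delete n;
        }
    }

    // Allocates l bytes, returns the address via s and registers it.
    void create(char*& s, unsigned l) {
        assert(l > 0);
        s = new char[l];
        root = new block_node{s, root};
    }

    // Frees s only if this registry created it; foreign pointers are left alone.
    bool kill(char* s) {
        for (block_node** link = &root; *link != nullptr; link = &(*link)->next) {
            if ((*link)->value == s) {
                block_node* n = *link;
                *link = n->next;
                delete[] n->value;
                delete n;
                return true;
            }
        }
        return false;
    }

    // True if s is currently registered.
    bool owns(const char* s) const {
        for (const block_node* n = root; n != nullptr; n = n->next)
            if (n->value == s) return true;
        return false;
    }
};
```

With such a list, a pointer that was initialised from a string constant is never freed by mistake, while every block created during loading is released at the latest when the list is destroyed.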

3.2. Class OBJECT_list

struct olist_element /*list element type of OBJECT_list*/
{
        int number; /*position number in OBJECT_list*/
        persistent *object_ptr; /*pointer of persistent obj.*/
        struct olist_element *next; /*pointer of next list element*/
};

class OBJECT_list /*list of read or written data*/
{
        olist_element *list_anchor; /*pointer of first list element*/
        olist_element *current_position; /*pointer of last list element*/
        int quantity; /*number of elements in list */
public:
        OBJECT_list();
        ~OBJECT_list();
        void insert(persistent*); /*insert element in list*/
        int number_of(persistent*); /*get position number of element*/
        persistent *object(int); /*get pointer of persistent obj.*/
        void clear_list(); /*delete all list elements*/
};

The persistence mechanism uses the class OBJECT_list to recognise, especially in cyclic, back-linked or circular structures, which objects have already been loaded or stored. It is a tool of the object management of the generic class. The class is a singly linked linear list type that manages references to instances of classes derived from persistent. A list entry consists of a pointer to the respective object, a variable for the identification number (ID) assigned by the object list, and a pointer to the next list entry. The member data of the class include two pointers, to the first and to the last list entry. Calling the method insert(element) generates a new list entry for element and assigns it a unique identification number. The method object(number) returns the reference to the object with the identification number number. If this object is not in the list, a null pointer is returned. Calling the method number_of(object) returns the ID of the object referenced by object. If this object has not been registered in the list, 0 is returned. If object is a null pointer, -1 is returned. An instance of this class is initialised with an empty list.
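
The return conventions described above can be sketched as follows. This is not the library source: persistent_stub, id_node and object_ids are hypothetical names, and the sketch assumes that identification numbers start at 1 (which is what leaves 0 free to mean "not registered").

```cpp
#include <cassert>

struct persistent_stub {};   // hypothetical stand-in for class persistent

struct id_node {
    int              number;      // identification number assigned by the list
    persistent_stub* object_ptr;  // registered object
    id_node*         next;        // next list entry
};

// Hypothetical stand-in for OBJECT_list.
class object_ids {
    id_node* first = nullptr;
    id_node* last = nullptr;
    int      quantity = 0;
public:
    object_ids() = default;
    object_ids(const object_ids&) = delete;
    object_ids& operator=(const object_ids&) = delete;
    ~object_ids() { clear_list(); }

    // Appends p and assigns the next free number (assumed to start at 1).
    void insert(persistent_stub* p) {
        assert(p != nullptr);
        id_node* n = new id_node{++quantity, p, nullptr};
        if (last != nullptr) last->next = n; else first = n;
        last = n;
    }

    // -1 for a null pointer, 0 if p is not registered, otherwise its number.
    int number_of(const persistent_stub* p) const {
        if (p == nullptr) return -1;
        for (const id_node* n = first; n != nullptr; n = n->next)
            if (n->object_ptr == p) return n->number;
        return 0;
    }

    // The object with the given number, or a null pointer if there is none.
    persistent_stub* object(int number) const {
        for (const id_node* n = first; n != nullptr; n = n->next)
            if (n->number == number) return n->object_ptr;
        return nullptr;
    }

    // Deletes all list entries; the registered objects themselves are untouched.
    void clear_list() {
        while (first != nullptr) {
            id_node* n = first;
            first = n->next;
            delete n;
        }
        last = nullptr;
        quantity = 0;
    }
};
```

A lookup is linear in the number of registered objects, which is acceptable for small structures; a hash map keyed by address would be the obvious replacement for large object graphs.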

3.3. The message parameter structure

All information the objects need to perform loading or saving is collected in the structure persis_paramtyp. At the beginning of a persistence operation exactly one (!) instance of this structure is created, initialised accordingly and sent as a message parameter to all objects concerned within an object structure. The flag to_disk denotes the mode of the persistence mechanism: true stands for saving, false for loading. The element file is the file descriptor of the file holding the persistent data of the objects (opened for reading or writing, respectively). The element current_olist is an instance of the list class OBJECT_list (paragraph 3.2). Each involved object registers its address in this list (key word: cyclic structures) and can query whether a given object has already been stored. The structure persis_paramtyp is the only interface parameter type of the entire persistence mechanism of class persistent (paragraph 3.4); it carries the communication between the objects, which is transparent to the user. Before this parameter is used, the methods save(...) and load(...) must initialise it: the flag to_disk is set accordingly, and a file is opened (for reading or writing) whose address is assigned to the element file. The list element current_olist initialises itself as an empty list when its constructor is called on creation.

struct persis_paramtyp
{
        BOOL            to_disk;        /* flag to sign if to read or to write*/
        FILE            *file;          /* file descriptor                    */
        OBJECT_list     current_olist;
/* while loading or storing it contains references to*/
/* the stored/loaded Objects*/
};

3.4. The class persistent

To give an arbitrary class the ability to store its instances persistently, this approach requires the class to be derived from the base class persistent.

class persistent
{
 /*private memberdata and -methods*/
private:        charlist_typ    dynamicstringlist;
 /*garbagecollector: to register all by persistent created strings*/
                persis_paramtyp *persis_para;

        void store_instance(persistent*);  /*stores the parameterinstance*/
        persistent      *get_instance(persistent*,BOOL);
        persistent      *p_create_Instance(char *name);
 /*########  the only Communicationmethod to other persistent- derivates  */
void p_message(persis_paramtyp*,char*); /*store obj. in file*/
public:
     int save (char *,char*,...);  /* Interface- methods*/
     int load (char *,char*,...);

protected:
    persistent(){}
    persistent(persistent *){}
       /*special constructor only used by creating derivatives by
        persistence mechanism*/
    virtual ~persistent(){}

    DEF_GETCLASSNAME_FUNC(persistent)
       /*Macro: creates the p_get_classname()- method, this function
         will be created for derived classes implicitly by macro MOVE_DATA*/

    virtual void move_data(){}
        /*encapsulates the data methods; empty for this generic class */
        /*the following methods are only for use in move_data() methods;
          dynamic_object(persistent**,persistent*,BOOL) is a macro, listed
          here only for completeness */
    void static_object(persistent &); /*for static persistent derivates*/
    void static_string(char*);
    void data(void *,unsigned);

          /*for universal static datatypes; is used by several macros*/
    void dynamic_string(char* &n,BOOL=true);/* for 0- terminated char* */
    void dynamic_data(char* &,unsigned,BOOL=true);
        /*for dynamic data, notinstances of classes*/
    void ptr_to_const(char *&,BOOL=true);
    void dynamic_obj(persistent **,persistent*, BOOL=true);
    void dynamic_template_obj(persistent **,persistent*,char*, BOOL=true);
        /*the next method is meaningless for this generic class; derived
          classes override it implicitly through the macro MOVE_DATA*/
    virtual void move_memberdata_of(char*)
          { PERSIS_EXCEPTION(ERR_UNKNOWN_CLASS);}
};

As the central class it contains the base methods and data structures of the persistence mechanism, which implement the processing of different simple and complex data types. These mechanisms handle static member variables (part of the instance) as well as dynamic ones. The user interface consists only of the two methods load() and save(). Calling one of them starts the persistence mechanism for the respective instance. Both methods take care of opening the specified file (for reading or writing, respectively) and of creating and initialising an instance of the structure persis_paramtyp. Finally they send this structure as the parameter of the method p_message() to their own object. This calls its virtual method move_data(), which is the implementation interface of the persistence mechanism of class persistent. In derived classes this method is redefined by listing, one after another, the calls of the appropriate persistence methods (data(...), static_data(...), etc.) for all member elements. Each persistence method (paragraph 3.4.2) reads the member variable persis_para to detect the mode (saving or loading) of the persistence mechanism and performs the corresponding procedure. If a member element is itself a persistent derivative, the persistence method (e.g. static_object(...)) sends a message to this object (p_message(...)), which recursively activates the corresponding procedure in that object.

3.4.1. Implementation interface : the virtual method move_data()

This method "describes" the member data structure of the respective class derived from persistent. If there are no relevant member data, the method body can be empty, or it can just call the move_data() method of an inherited class, if there is one. Usually, however, it contains the calls of the persistence methods (paragraph 3.4.2) for all relevant member data. The order of the calls is not important. If the class inherits from other persistent derivatives, their member data must not be forgotten: the move_data() methods of these classes must be called explicitly. These calls can be placed anywhere within the method, even between other persistence method calls. Note: the head of the move_data() method is created implicitly by the macro MOVE_DATA (see section 3.4.4).

3.4.2. Persistence methods

The class persistent supplies persistence methods for dynamic and static member data of different types. These methods send messages (in the object-oriented sense) to the data, which make them store themselves to, or load themselves from, the file persis_para->file, depending on the flag persis_para->to_disk.

  1. data(void*p,unsigned n) The most elementary method. It writes <n> bytes from, or loads <n> bytes into, the memory area <p>. It can typically be used for instances of all types except classes.
  2. static_string(char* s) Deals with the string <s>. First the length of the string is stored or loaded, followed by the string data.
  3. dynamic_data(char *&,unsigned,bool) Deals with dynamic data types. The first parameter is the pointer to the object, the second parameter is its size. The third parameter is a flag that tells the mechanism whether memory is to be allocated before the data are loaded (true, LOAD_AFTER_CREATING) or not (false, LOAD_WITHOUT_CREATING).
  4. dynamic_string(char*&,bool) Deals with dynamic strings. The flag has the same meaning as in the method dynamic_data(...).
  5. ptr_to_const(char*&,bool) Handles pointers to strings (char*) that were initialised with the addresses of constants (see the problem in paragraph 3.1). The strings loaded by this method are registered by the garbage collection. The flag has the same meaning as in the method dynamic_data(...).
  6. static_object(persistent&) Deals with static (instance members !!!) persistent derivatives. The object passed as parameter is sent the message to store or load itself.
  7. dynamic_obj(persistent**,persistent*,bool) Makes the referenced persistent-derived instance store or load itself. The first parameter is the address of the pointer to the instance. The second parameter is the pointer to the instance itself. The reason for this redundant parameter passing is type safety. The method call is created by a macro (refer to the examples) that casts the first parameter to persistent**, so without the second parameter the method could be applied to other member data. That could lead to runtime errors that are hard to track down. To avoid these errors, the parameter list has been extended, so that wrong use of the macro (see section 3.4.4) is detected by the compiler. The third parameter has the same meaning as in 4. and 5.
  8. dynamic_template_obj(persistent**,persistent*,char*,bool) Extends the functionality of dynamic_obj(...) to templates. The name of the parameter type must be passed as a string (type char*).
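
The way one move_data() body serves both directions can be sketched in a few lines. This is a reduced illustration, not the library code: mini_param, mini_persistent, transfer(), point and round_trip() are hypothetical names, a std::iostream stands in for the FILE* of persis_paramtyp, and static_string() takes an additional capacity argument that the library method does not have, so that the sketch cannot overrun the buffer on loading.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical, reduced stand-in for persis_paramtyp.
struct mini_param {
    bool           to_disk;  // true: saving, false: loading
    std::iostream* file;     // stand-in for the FILE* of the library
};

class mini_persistent {
    mini_param* para = nullptr;
    bool        ok = true;
protected:
    // Counterpart of data(void*,unsigned): writes or reads n raw bytes.
    void data(void* p, unsigned n) {
        assert(para != nullptr);
        if (para->to_disk) para->file->write(static_cast<const char*>(p), n);
        else               para->file->read(static_cast<char*>(p), n);
        ok = ok && para->file->good();
    }
    // Counterpart of static_string(char*): the length first, then the characters.
    void static_string(char* s, unsigned capacity) {
        unsigned len = para->to_disk ? static_cast<unsigned>(std::strlen(s)) + 1 : 0;
        data(&len, sizeof len);
        if (len > capacity) ok = false;   // stored string would not fit
        if (ok) data(s, len);
    }
    virtual void move_data() = 0;         // describes the member data once
public:
    virtual ~mini_persistent() {}
    // Runs move_data() in the requested mode and reports success.
    bool transfer(std::iostream& file, bool to_disk) {
        mini_param p{to_disk, &file};
        para = &p;
        ok = true;
        move_data();
        para = nullptr;
        return ok;
    }
};

struct point : mini_persistent {
    int  x = 0, y = 0;
    char label[16] = "";
protected:
    void move_data() override {           // same calls for saving and loading
        data(&x, sizeof x);
        data(&y, sizeof y);
        static_string(label, sizeof label);
    }
};

// Saves src into an in-memory stream and loads dst from it.
bool round_trip(point& src, point& dst) {
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    return src.transfer(file, true) && dst.transfer(file, false);
}
```

Because saving and loading execute the identical call sequence, the order of the calls inside move_data() can be chosen freely, as long as it is not changed between writing a file and reading it back.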

3.4.3. Global object generation

To be able to reconstruct instances of all persistent-derived classes within a project, these classes must be made known to the persistence mechanism. This is done by one method, which must be defined exactly once in the entire project (but only if the program deals with dynamic object structures). This method is declared as follows:

 persistent* persistent::p_create_Instance(char *name)

 

It creates an instance of the class whose name is passed in the parameter string name. Example: within a project, three persistent derivatives (class1, class2, class3) have been defined. The method p_create_Instance() must then be defined as follows:

persistent *persistent::p_create_Instance(char *name)
{ if (strcmp(name,"class1")==0) return new class1((persistent*)NULL);else
  if (strcmp(name,"class2")==0) return new class2((persistent*)NULL);else
  if (strcmp(name,"class3")==0) return new class3((persistent*)NULL);else
   return NULL;
}

Because this definition is inconvenient to write, macros have been defined (see section 3.4.4). Using these macros, the method definition looks as follows:

DEF_CREATE_INSTANCE(REGISTER(class1)
                    REGISTER(class2)
                    REGISTER(class3))
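
The name-based factory of this section can be tried out in isolation with the following sketch. It deliberately uses hypothetical names (shape, circle, square, REGISTER_SKETCH, DEF_CREATE_SKETCH, create_instance, can_create) instead of the library identifiers, and it assumes that a default constructor is sufficient; the library passes a (persistent*)NULL argument to a special constructor instead.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the base class persistent.
struct shape {
    virtual ~shape() {}
    virtual const char* class_name() const = 0;
};
struct circle : shape {
    const char* class_name() const override { return "circle"; }
};
struct square : shape {
    const char* class_name() const override { return "square"; }
};

// One link of the if/else chain: #typ turns the class name into a string.
#define REGISTER_SKETCH(typ) \
    if (std::strcmp(name, #typ) == 0) return new typ; else

// Wraps the chain into a function that ends with "return nullptr".
#define DEF_CREATE_SKETCH(parameters) \
    shape* create_instance(const char* name) \
    { \
        parameters \
        return nullptr; \
    }

DEF_CREATE_SKETCH(REGISTER_SKETCH(circle)
                  REGISTER_SKETCH(square))

// True if name is registered and the created object reports that name.
bool can_create(const char* name) {
    assert(name != nullptr);
    shape* s = create_instance(name);
    bool ok = s != nullptr && std::strcmp(s->class_name(), name) == 0;
    delete s;
    return ok;
}
```

The trailing else of every registration line is what lets the final return statement act as the fallback for unknown class names.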

3.4.4. Symbolic constants and macros

For simplicity and better readability of the sources, several symbolic constants and macros have been defined.
The following list gives an overview:

Flag values for the persistence methods 3., 4., 5., 7. and 8.: LOAD_AFTER_CREATING (true) and LOAD_WITHOUT_CREATING (false).

To simplify the persistence method calls, several macros have been defined:

  1. static_data(arg) Value: data((void*)&arg, sizeof(arg)). This macro can be used for all static member data (except class instances).
  2. static_field(arg) Value: data((void*)arg,sizeof(arg)). This macro can be used for static C arrays (type field[number]).
  3. dynamic_object(object,create) Value: dynamic_obj((persistent**)&object, object, create). Generates the persistence method call for the pointer object (to an instance of a persistent-derived class). The compiler can detect errors caused by using the macro for other parameter types.
  4. dynamic_template_object(object,name,create) Value: dynamic_template_obj((persistent**)&object,object,name,create). Generates the call of the persistence method for object.

Macros for the global object generation

#define DEF_CREATE_INSTANCE(parameters) \
persistent *persistent::p_create_Instance(char *name)\
{\
        parameters\
        return NULL;\
}
#define REGISTER(typ)   if (strcmp(name,#typ)==0)\
                            return new typ((persistent*)NULL); else

The most important macro, MOVE_DATA(classname,predecessor), has to be used in every persistent-derived class. It has two arguments: classname is the name of the class for which the method is being defined, and predecessor is the name of the class it inherits from. The macro creates a special constructor, which is used only by the persistence mechanism, and the p_get_classname() method (created implicitly through the macro DEF_GETCLASSNAME_FUNC). Furthermore, it creates a virtual method move_memberdata_of(char* name). This method allows the persistence mechanism to process only the member data of one inherited class. The argument string is compared with the name of the current class. If both are equal, the respective move_data() method is called. Otherwise the method of the predecessor is called and the comparison starts again there.
Finally the MOVE_DATA macro generates the method declaration (head) of the move_data() method. The method body can follow directly, as shown in the examples. The macro is intended to spare the programmer repetitive declarations and to make the sources more readable.

#define MOVE_DATA(classname,predecessor) \
classname(persistent* p):predecessor(p){}\
DEF_GETCLASSNAME_FUNC(classname)\
virtual void move_memberdata_of(char* name)\
{\
 if (strcmp(classname::p_get_classname(),name)==0)\
                classname::move_data();\
        else predecessor::move_memberdata_of(name);\
}\
virtual void move_data()

Example:

class klasse1: public persistent
{
 public:
        int     member1;
        float   result;        //two arbitrary members

                MOVE_DATA(klasse1,persistent)
                {
                        persistent::move_data(); //treats predecessor
                        static_data(member1);
                        static_data(result);
                }
};

is translated by the pre-processor to:

        class klasse1: public persistent
        {
        public:
                int     member1;
                float   result;

//created by macro !!!
        klasse1(persistent* p):persistent(p){} //special constructor
        virtual char* p_get_classname()       //returns class name
              {return "klasse1";}
        virtual void move_memberdata_of(char* name)
        { if (strcmp(klasse1::p_get_classname(),name)==0)
                klasse1::move_data();   //if name equals the class name then
                                        //class specific Method klasse1::move_data()
             else persistent::move_memberdata_of(name);
                 //else the same procedure with the predecessor
        }
        virtual void move_data()        //move_data()- head
        {      persistent::move_data(); //deal with predecessor data
               data((void*)&member1,sizeof(member1));
               data((void*)&result ,sizeof(result));
        }
};
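
How the generated move_memberdata_of() walks up the inheritance chain can be observed with a reduced, self-contained variant of the macro. All names (root_p, MOVE_DATA_SKETCH, animal, dog, run_for) are hypothetical; the sketch records which move_data() bodies ran in a string instead of writing data, and it reports an unknown class name through a bool return value where the library raises ERR_UNKNOWN_CLASS.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for the generic base class.
struct root_p {
    std::string log;   // records which move_data() bodies were executed
    virtual ~root_p() {}
    virtual const char* p_get_classname() { return "root_p"; }
    virtual void move_data() {}
    virtual bool move_memberdata_of(const char*) { return false; }  // unknown class
};

// Reduced variant of MOVE_DATA: no special constructor, bool result.
#define MOVE_DATA_SKETCH(classname, predecessor) \
    const char* p_get_classname() override { return #classname; } \
    bool move_memberdata_of(const char* name) override { \
        if (std::strcmp(classname::p_get_classname(), name) == 0) { \
            classname::move_data(); \
            return true; \
        } \
        return predecessor::move_memberdata_of(name); \
    } \
    void move_data() override

struct animal : root_p {
    MOVE_DATA_SKETCH(animal, root_p) { log += "animal;"; }
};

struct dog : animal {
    MOVE_DATA_SKETCH(dog, animal) {
        animal::move_data();   // treat the predecessor first
        log += "dog;";
    }
};

// Asks a dog object to process only the data of the class called name.
std::string run_for(const char* name) {
    assert(name != nullptr);
    dog d;
    return d.move_memberdata_of(name) ? d.log : "unknown";
}
```

The qualified calls classname::p_get_classname() and classname::move_data() bypass virtual dispatch, which is what allows a dog object to run only the animal part of its data description.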


3.4.5. The interface methods load() and save()

The persistence mechanism is started by calling one of the two interface methods load() or save(). Both have the same parameter list.

  int persistent::save(char *name,char *classname,...)
  int persistent::load(char *name,char *classname,...)

char *name
The name of the file to which, or from which, the object (structure) is stored or loaded. The parameter is mandatory and must not be NULL!
char *classname
The name of the class whose member data should be processed. In deeper inheritance hierarchies it can happen that only the data of one particular inherited class need to be stored. In this case the parameter contains the name of this class as a string. If all member data have to be stored, the parameter is NULL (default case).
The ellipsis
The ellipsis passes pointers to other objects that should not be considered by the persistence mechanism. For instance, to store substructures that contain pointers to parent objects, these objects must be excluded from the storing/loading procedure. The ellipsis can hold several pointers. The list must be terminated by a NULL pointer !!!

All further methods are not accessible from outside and can therefore not be called.
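
Reading a NULL-terminated pointer list from an ellipsis can be sketched as follows. The names obj and collect_excluded are hypothetical, and the function only collects the pointers; what the library does with them afterwards is not shown here.

```cpp
#include <cassert>
#include <cstdarg>
#include <vector>

struct obj {};   // hypothetical stand-in for a persistent-derived class

// Collects the trailing pointer arguments up to the terminating null pointer.
std::vector<obj*> collect_excluded(const char* filename, const char* classname, ...) {
    assert(filename != nullptr);
    (void)classname;                       // unused in this sketch
    std::vector<obj*> excluded;
    va_list args;
    va_start(args, classname);
    for (obj* p = va_arg(args, obj*); p != nullptr; p = va_arg(args, obj*))
        excluded.push_back(p);
    va_end(args);
    return excluded;
}
```

A call then looks like collect_excluded("pers_obj", nullptr, &a, &b, static_cast<obj*>(nullptr)). Casting the terminator to the pointer type is a precaution of this sketch: the arguments of an ellipsis are not type-checked, and a plain NULL may be passed as an integer on some platforms.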

Return values of the interface methods

The return values are the results of the internal exception handling. If the base class has been compiled without the symbolic constant WITH_EXCEPTIONS, the exception handling is switched off. In that case faulty actions of the persistence mechanism do not produce meaningful return values, and in the worst case the program can fail with runtime errors.

3.5. Implementation example

struct tstruct
{ float b;
  char sname[10];
};

class test:public persistent
{ public : //for simplicity all members are public
 int intvec[10];
 struct tstruct *ptstruct;
 char name[20];
 long ldat;
 test *neighbour;
 char *constname;
 test() { constname="constant name"; }
MOVE_DATA(test,persistent)
{ static_field(intvec); //static field
  dynamic_data((char*&)ptstruct,sizeof(tstruct),LOAD_AFTER_CREATING);
        //The pointer must be cast to char*& to match the method declaration.
        //Memory will be allocated before the data are loaded.
  static_string(name);// An alternative would be static_field
                      // (because name is a static field).
  static_data(ldat);
  dynamic_object(neighbour,LOAD_AFTER_CREATING);
  ptr_to_const(constname);//Because <constname> will be initialised with
                          //a pointer to a string constant.
}
};

4. Abstract algorithm

Storing:

The instance is instructed to store its data fields in the file "testfile.persistent".

CALL INSTANCE.save("testfile.persistent")
     -> Open the file
     INSTANCE.para.file = File handle
     INSTANCE.para.to_disk flag = Write mode
     CALL INSTANCE.p_message method
             IF (INSTANCE has already been stored) THEN
                     Write object number
                     RETURN
             END IF
             -> Insert the address of INSTANCE in the object list
             CALL INSTANCE.move_data method
                     FOR (all persistent derivatives)
                             Write class name
                             CALL INSTANCE.derivative.p_message(INSTANCE.para)
                     END FOR
                     FOR (all other member data)
                             CALL (corresponding persistence method)
                     END FOR
     Close file

Loading:

The instance is instructed to reconstruct itself from the file "testfile.persistent".

CALL INSTANCE.load("testfile.persistent")
     -> Open the file
     INSTANCE.para.file = File handle
     INSTANCE.para.to_disk flag = Read mode
     CALL INSTANCE.p_message method
             IF (INSTANCE has already been loaded) THEN
                     RETURN
             END IF
             -> Insert the address of INSTANCE in the object list
             CALL INSTANCE.move_data method
                     FOR (all persistent derivatives)
                             -> Create INSTANCE.derivative
                             CALL INSTANCE.derivative.p_message(INSTANCE.para)
                     END FOR
                     FOR (all other member data)
                             CALL (corresponding persistence method)
                     END FOR
     Close file
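
The core of both algorithms, registering an object before following its references so that cycles terminate, can be sketched for a single node type. The names node, store, load and round_trip are hypothetical, a text stream stands in for the file, and the tag written in front of every reference (-1 for a null pointer, 0 for a new object, a positive number for an object already in the list) is an assumption of this sketch, not the file format of the library.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <vector>

struct node {
    int   value = 0;
    node* next = nullptr;   // may point back to an earlier node (cycle)
};

// Storing: ids plays the role of the object list (address -> number).
void store(const node* n, std::ostream& out, std::map<const node*, int>& ids) {
    if (n == nullptr) { out << -1 << ' '; return; }
    std::map<const node*, int>::const_iterator it = ids.find(n);
    if (it != ids.end()) { out << it->second << ' '; return; }  // already stored
    int number = static_cast<int>(ids.size()) + 1;
    ids[n] = number;                        // register BEFORE following references
    out << 0 << ' ' << n->value << ' ';     // 0 announces a new object
    store(n->next, out, ids);
}

// Loading: objects plays the role of the object list (number -> address).
node* load(std::istream& in, std::vector<node*>& objects) {
    int tag = 0;
    if (!(in >> tag) || tag == -1) return nullptr;
    if (tag > 0) return objects.at(tag - 1);   // reference to a loaded object
    node* n = new node;
    objects.push_back(n);                      // register BEFORE following references
    in >> n->value;
    n->next = load(in, objects);
    return n;
}

// Stores the structure reachable from head and returns the reloaded objects;
// the caller owns them.
std::vector<node*> round_trip(const node* head) {
    std::stringstream file;
    std::map<const node*, int> ids;
    store(head, file, ids);
    std::vector<node*> objects;
    node* copy = load(file, objects);
    assert(objects.empty() || copy == objects.front());
    (void)copy;
    return objects;
}
```

For a ring of three nodes with the values 1, 2 and 3 the stream contains "0 1 0 2 0 3 1 ": three new objects followed by a reference back to object number 1.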

5. User manual

To make a class persistent, the following steps have to be performed in this implementation:

  1. #include "persis.h"
  2. The respective class must inherit class persistent (directly or indirectly).
  3. Redefine the virtual method move_data() using the macro MOVE_DATA. This method "contains the necessary structure information" for the persistence mechanism of class persistent.
    If the class inherits the base class indirectly, the move_data() method must also contain the call predecessor::move_data() in order to store/load inherited properties.
  4. Define the method persistent::p_create_Instance() with the following macro, only once per project:
        DEF_CREATE_INSTANCE( REGISTER(Klasse1)
                             REGISTER(Klasse2))

Examples:

#include "persis.h"
// The class test1 has only simple data elements
class test1 : public persistent
{
  public: // needed so that p_create_Instance() can call the generated constructor
        char    name[10];
        int     a,b;
        float   c;
        struct  {
                int     f;
                char    t;} structure;

        MOVE_DATA(test1,persistent)
        {
                static_data(a);  //arbitrary order of calls
                static_data(b);
                static_data(c);
                static_data(structure);
                static_field(name);

        }

};


Class test2 has simple data and references to partner objects of type test2

class test2 : public test1
{
  public:
        int             a,b;
        unsigned        c;
        class test2     *partner;

        MOVE_DATA(test2,test1)
        {      static_data(a);
               static_data(b);
               static_data(c);
               test1::move_data(); // because it inherits test1
               dynamic_object(partner,LOAD_AFTER_CREATING);
        }
};

Class test3 has references to objects of type test2 and test1 .

class test3 : public persistent
{
  public:
        test1 *inst1,*inst2;
        test2 *inst3;

        MOVE_DATA(test3,persistent)
        {
                dynamic_object(inst1,LOAD_AFTER_CREATING);
                dynamic_object(inst2,LOAD_AFTER_CREATING);
                dynamic_object(inst3,LOAD_AFTER_CREATING);
        }
        *
        *
};

DEF_CREATE_INSTANCE(    REGISTER(test1)
                        REGISTER(test2)
                        REGISTER(test3))

//Implementation of the persistent::p_create_Instance()- Method

To store an instance, only the save(...) method has to be called.

Example:

h->save("pers_obj",NULL,para,NULL);

The instance is stored in the file "pers_obj" (for the parameter para see section 6, Consideration of sub trees and variable structure).

Before the object is loaded, the root object is created with the new operator. After that the load(...) method is called:

Example:

h = new test();
h->load("pers_obj",NULL,para,NULL);

To process only the data elements of an inherited class, its class name must be passed as a string parameter.

h->save("pers_obj","test1",NULL);

Here an instance of type test2 stores only the member data of its inherited type test1.

6. Consideration of sub trees and variable structure

The interface methods load(char*,char*,...) and save(char*,char*,...) take as their third parameter a persistent* argument list of variable length (ellipsis). If, for example, only a subtree is to be stored, it must be ensured that references to parent objects that are not involved are not followed by the persistence mechanism.

This is achieved by passing the pointers to the objects that should not be stored as parameters of the interface methods. The argument list must (!!!) be terminated by a NULL pointer.

These addresses are registered in the OBJECT_list, which contains the pointers of all objects already stored. The persistence mechanism is thereby led to treat these objects as already stored or loaded. Thus a subtree containing references to other instances that should not be considered can be stored selectively.

A subtree stored in this way can be reloaded into another structure or on its own! In the latter case the last parameter of the load() method (ellipsis) is simply a NULL pointer.
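
The effect of pre-registering excluded objects can be sketched with a small tree that carries parent pointers. All names (item, number_of, store, load, move_subtree) are hypothetical, a text stream stands in for the file, and the tag in front of every reference is an assumption of the sketch. It also assumes that loading pre-registers exactly as many objects as storing did, so the autonomous reload mentioned above is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

struct item {
    int   value = 0;
    item* parent = nullptr;
    item* child = nullptr;
};

// Position (starting at 1) of p in the object list, 0 if absent, -1 for null.
int number_of(const std::vector<item*>& list, const item* p) {
    if (p == nullptr) return -1;
    for (std::size_t i = 0; i < list.size(); ++i)
        if (list[i] == p) return static_cast<int>(i) + 1;
    return 0;
}

void store(item* it, std::ostream& out, std::vector<item*>& list) {
    int id = number_of(list, it);
    out << id << ' ';
    if (id != 0) return;            // null, excluded, or already stored
    list.push_back(it);
    out << it->value << ' ';
    store(it->parent, out, list);
    store(it->child, out, list);
}

item* load(std::istream& in, std::vector<item*>& list) {
    int id = 0;
    if (!(in >> id) || id == -1) return nullptr;
    if (id > 0) return list.at(id - 1);   // known object, possibly pre-registered
    item* it = new item;
    list.push_back(it);
    in >> it->value;
    it->parent = load(in, list);
    it->child = load(in, list);
    return it;
}

// Stores the subtree below sub without excluded, then reloads it so that
// references to excluded are redirected to new_parent.
item* move_subtree(item* sub, item* excluded, item* new_parent) {
    assert(sub != nullptr && excluded != nullptr && new_parent != nullptr);
    std::stringstream file;
    std::vector<item*> stored{excluded};     // treated as already stored
    store(sub, file, stored);
    std::vector<item*> loaded{new_parent};   // same list position, other object
    return load(file, loaded);
}
```

Because the excluded parent occupies position 1 in both lists, every reference to it in the file is resolved to the replacement parent on loading, which is how a subtree ends up attached to another structure.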

(See example2.cc)

7. View forward and note

Because standard data types can have different formats on different platforms, files created by the persistence mechanism can only be exchanged reliably if the data are converted into a suitable exchange format (such as ASCII strings). This can be achieved by extending the persistence methods (especially those that physically write the data). One approach, for example, is to convert all data into strings before they are physically written. From a design point of view this corresponds to inserting a further abstraction layer. Hence existing implementations that use the library do not need to be changed.
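
What such a string-based layer could look like is sketched below for integers and strings. None of these functions exist in the library; put_int, get_int, put_string, get_string, encode and decode are hypothetical names, and the format (decimal text, strings with a length prefix) is just one possible choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Integers are written as decimal text, independent of byte order and size.
void put_int(std::ostream& out, std::int32_t v) { out << v << '\n'; }
bool get_int(std::istream& in, std::int32_t& v) { return static_cast<bool>(in >> v); }

// Strings are written as "<length> <characters>", so they may contain blanks.
void put_string(std::ostream& out, const std::string& s) {
    out << s.size() << ' ' << s << '\n';
}
bool get_string(std::istream& in, std::string& s) {
    std::size_t n = 0;
    if (!(in >> n)) return false;
    in.get();                                    // skip the separating blank
    s.resize(n);
    in.read(&s[0], static_cast<std::streamsize>(n));
    return static_cast<bool>(in);
}

// Writes one number and one text in the exchange format.
std::string encode(std::int32_t number, const std::string& text) {
    std::ostringstream out;
    put_int(out, number);
    put_string(out, text);
    return out.str();
}

// Reads back what encode() produced.
void decode(const std::string& data, std::int32_t& number, std::string& text) {
    std::istringstream in(data);
    bool ok = get_int(in, number) && get_string(in, text);
    assert(ok && "malformed input");
    (void)ok;
}
```

Floating-point values would need additional care (a fixed precision or a hexadecimal representation) and are left out of the sketch.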


Wismar, 2. January 1998
u.dolinsky@iname.com