Sticker Systems Over Monoids

Molecular computing has attracted considerable interest among researchers since Head introduced the first theoretical model for DNA-based computation using the splicing operation in 1987. Another model for DNA computing is based on the sticker operation, which Adleman used in his successful experiment for the computation of Hamiltonian paths in a graph: a double-stranded DNA sequence is composed by prolonging to the left and to the right a sequence of (single or double) symbols by using given single-stranded strings or even more complex dominoes with sticky ends, gluing these ends together with the sticky ends of the current sequence according to a complementarity relation. Based on this sticker operation, a language generating mechanism, called a sticker system, can be defined: a set of (incomplete) double-stranded sequences (axioms) and a set of pairs of single- or double-stranded complementary sequences are given. The initial sequences are prolonged to the left and to the right by using sequences from the latter set. Iterating these prolongations produces “computations” of possibly arbitrary length. These processes stop when a complete double-stranded sequence is obtained. Without restrictions, sticker systems generate only regular languages. Additional restrictions can be imposed on the matching pairs of strands to obtain more powerful languages. Several types of sticker systems are shown to have the same power as regular grammars; one type is found to represent all linear languages, whereas another is proved to be able to represent any recursively enumerable language. The main aim of this research is to introduce and study sticker systems over monoids, in which an element of a monoid is associated with each sticker operation and a complete double-stranded sequence is considered valid if the computation of the associated monoid elements produces the neutral element.
Moreover, the sticker system over monoids is defined in this study.
Keywords: Sticker operation; Valence grammar; Valence sticker system; H system
© 2012 Ibnu Sina Institute. All rights reserved. http://dx.doi.org/10.11113/mjfas.v8n3.136


INTRODUCTION
The first DNA computation experiment was successfully performed by Adleman in 1994 [1], which showed that DNA computing is possible. The sticker system was introduced in [2] as a formal language model for the self-assembly (annealing and ligation operations) phase of Adleman's experiment [1]. The basic operation of sticker systems is the sticking operation [2,3,4,5], which forms double stranded DNA sequences from blocks of DNA single strands of arbitrary shape. The sticker system model uses no enzymes, employs reusable DNA and has a random access memory that requires no strand extension [5,6]. The aim of this research is to extend DNA sticker systems over monoids. Some new concepts of DNA sticker systems over monoids are introduced in this paper.

Sticker Operation
A DNA sequence is a double stranded sequence composed of adenine (A), thymine (T), cytosine (C) and guanine (G). The bonding between the bases follows the base pairing rule proposed by Watson and Crick in 1953 [7], where A is paired with T and C is paired with G. The basic formalism of the sticker operation is that two single strands of DNA are glued together to form a double stranded DNA sequence. This gluing operation continues as long as more single stranded DNA sequences are added to the solution.
Consider an alphabet $V$ (a finite set of symbols) and a symmetric biological relation $\rho$, where $\rho \subseteq V \times V$. Let # be a special symbol not in $V$, denoting an empty space (the blank symbol). The symbol $V^*$ denotes the set of all strings of symbols over $V$. The alphabet is taken to consist of the abstract DNA bases {a, t, c, g}, and the relation $\rho \subseteq V \times V$ represents the complementarity relation (a is complementary to t and c is complementary to g). The matching base pairs of a double stranded DNA molecule can be represented as $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$, where $x_1$ is the upper strand and $x_2$ is the lower strand, and $x_1$ is complementary to $x_2$.
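As a small illustration of the complementarity relation and of matching strands, the following Python sketch encodes the relation as a set of pairs and checks whether two strings can serve as the upper and lower strands of a double stranded molecule. The names `RHO`, `is_symmetric` and `is_double_strand` are hypothetical helpers introduced here for illustration only; they are not notation from the cited literature.

```python
# Abstract DNA bases and the complementarity relation from the text
# (a pairs with t, c pairs with g). The names below are illustrative.
V = {"a", "t", "c", "g"}
RHO = {("a", "t"), ("t", "a"), ("c", "g"), ("g", "c")}


def is_symmetric(relation):
    """A relation is symmetric if (y, x) is present whenever (x, y) is."""
    return all((y, x) in relation for (x, y) in relation)


def is_double_strand(upper, lower, relation=RHO):
    """True if both strands have equal length and pair up position by position."""
    if len(upper) != len(lower):
        return False
    return all((u, l) in relation for u, l in zip(upper, lower))
```

For instance, `is_double_strand("atcg", "tagc")` holds, while `is_double_strand("atcg", "tagg")` fails at the last position.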
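Using the elements of $V \cup \{\#\}$, the following sets of composite symbols are constructed: $\begin{pmatrix} \# \\ V \end{pmatrix} = \left\{ \begin{pmatrix} \# \\ a \end{pmatrix} \mid a \in V \right\}$, $\begin{pmatrix} V \\ \# \end{pmatrix} = \left\{ \begin{pmatrix} a \\ \# \end{pmatrix} \mid a \in V \right\}$ and $\begin{bmatrix} V \\ V \end{bmatrix}_\rho = \left\{ \begin{bmatrix} a \\ b \end{bmatrix} \mid a, b \in V, (a, b) \in \rho \right\}$. The Watson-Crick domain associated with the alphabet $V$ and the complementarity relation $\rho$ is denoted by $WK_\rho(V)$. For example, if we have two strings $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, then the concatenation of $x$ and $y$ is $\begin{bmatrix} x_1 y_1 \\ x_2 y_2 \end{bmatrix}$. An element $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \in WK_\rho(V)$ is called a well-formed double stranded sequence (or simply a double stranded sequence) or molecule [4,8]. Here, $x_1$ is called the upper strand and $x_2$ is called the lower strand. Note that the lengths of $x_1$ and $x_2$ are equal and that corresponding symbols are related by $\rho$: writing $x_1 = a_1 a_2 \cdots a_n$ and $x_2 = b_1 b_2 \cdots b_n$, we have $(a_i, b_i) \in \rho$ for $i = 1, 2, \ldots, n$. By symmetry, $(b_i, a_i) \in \rho$ for $i = 1, 2, \ldots, n$ also holds.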
The concept of incomplete molecules is used in this paper as an abstraction of DNA molecules with sticky ends [4,6]. These molecules consist of mixed DNA molecules with double and single strands. An incomplete molecule is an element of the set $W_\rho(V) = L_\rho(V) \cup R_\rho(V) \cup LR_\rho(V)$. The third type of possible shape, $LR_\rho(V)$, can be obtained by extending the first two basic types of elements in $W_\rho(V)$. Figure 1 illustrates the possible shapes of the three types of incomplete molecules [4].
There are eight possible shapes of dominoes used in the sticking operation, which lead to different cases when one is stuck to another. Notice that in all cases we have a double stranded sequence 'x' and an overhang sequence 'y' or 'y and z'. The sticky ends are placed either in the lower strand or in the upper strand. However, for the third type, $LR_\rho(V)$, the double stranded part $x$ is in $\begin{bmatrix} V \\ V \end{bmatrix}_\rho^+$ (that is, $x$ is a nonempty double stranded sequence), and the element is called a well-started double stranded sequence.

Fig. 1 Possible Shapes of Dominoes
The two strands of a domino in $W_\rho(V)$ might have different lengths, and it is useful to indicate the blank space by the symbol #. A partial operation $\mu$ simulates the concatenation and the ligation or annealing operation among the elements of $W_\rho(V)$. This operation is known as the sticking operation and is the primary tool of sticker systems [6]. The elements of $W_\rho(V)$ can be prolonged to the left or to the right with a brick or domino, provided that the corresponding sticky ends are complementary. The process of prolongation continues until a complete double stranded molecule is obtained.
It is easy to see that there are eight possible cases of the sticker operation on the elements of $W_\rho(V)$, as illustrated in Figure 2.

Fig. 2 Eight Possible Cases of Sticker Operation
The sticker operation discussed here is an operation on formal languages that simulates the concatenation and annealing of the incomplete molecules [6] of $W_\rho(V)$ into complete double stranded molecules. The sticker operation is the primary tool of the sticker system, as discussed in the next section.
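The prolongation of an incomplete molecule can be sketched in Python as follows. This is a deliberately simplified illustration, not the full eight-case operation: it handles only prolongation to the right with a single-stranded sticker, and it assumes that both strands are written as equal-length strings padded with the blank symbol # on the right. The function names `stick_right` and `is_complete` are hypothetical.

```python
BLANK = "#"
# Complementarity relation on the abstract bases {a, t, c, g}.
RHO = {("a", "t"), ("t", "a"), ("c", "g"), ("g", "c")}


def stick_right(molecule, sticker, strand):
    """Prolong `molecule` to the right with a single-stranded `sticker`.

    molecule: pair (upper, lower) of equal-length strings over {a, t, c, g, #}
    strand:   "upper" or "lower", the strand that receives the sticker
    Returns the prolonged molecule, or None when the sticking is undefined
    because two facing bases are not complementary.
    """
    upper, lower = molecule
    target, other = (upper, lower) if strand == "upper" else (lower, upper)
    # The sticker is attached right after the last base of the target strand.
    start = len(target.rstrip(BLANK))
    new_target = target[:start] + sticker
    width = max(len(new_target), len(other))
    new_target = new_target.ljust(width, BLANK)
    new_other = other.ljust(width, BLANK)
    for t, o in zip(new_target, new_other):
        if BLANK in (t, o):
            continue  # nothing to check where one strand is blank
        pair = (t, o) if strand == "upper" else (o, t)
        if pair not in RHO:
            return None
    if strand == "upper":
        return (new_target, new_other)
    return (new_other, new_target)


def is_complete(molecule):
    """A molecule is complete when no blank symbol remains."""
    upper, lower = molecule
    return BLANK not in upper + lower


# A short computation: the upper overhang "cg" is filled in base by base.
x1 = ("atcg", "ta##")
x2 = stick_right(x1, "g", "lower")
x3 = stick_right(x2, "c", "lower")
```

Here `x3` is a complete molecule, whereas trying to stick a non-complementary base, as in `stick_right(x1, "a", "lower")`, is undefined and yields `None`.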

Sticker System
The experiment carried out by Adleman [1] in 1994 to solve the Hamiltonian Path Problem (HPP) initiated the study of sticker systems by Kari et al. [2].
A sticker system is a language generating device which was first introduced by Kari et al. [2] in 1998 as a formal language model for the self-assembly phase of Adleman's experiment. The term regular sticker system was first used in [2] and was then extended to bidirectional sticker systems in [5]. However, both concepts were merged into the sticker system in [4], and the term 'sticker system' will be used in this paper.
When forming new molecules, the initial strands, called axioms, and a well-started sequence are utilized and prolonged either to the left or to the right by the operation $\mu$ [8]. Starting from the axiom and iteratively using the operation of sticking, strands are prolonged by using dominoes or bricks in order to obtain a complete double stranded sequence [9]. In [4], Paun et al. defined sticker systems in a very general form; the formal definition of a sticker system in this paper follows the definition in [4] and also makes use of the definitions given by Kari et al. in [2] and [5].
A sticker system is a 4-tuple $\gamma = (V, \rho, A, D)$, where $V$ is an alphabet endowed with the symmetric relation $\rho \subseteq V \times V$, $A$ is a finite set of axioms from $LR_\rho(V)$, and $D$ is a finite set of pairs $(B_u, B_d)$, where $B_u$ and $B_d$ are finite subsets of upper and lower stickers of $W_\rho(V)$, respectively. The mechanism of the sticker system starts from an axiom and uses the pairs of $B_u$ and $B_d$ to prolong the strands to the left or to the right according to the sticker operation $\mu$ in order to obtain a set of double stranded sequences in $WK_\rho(V)$ (complete molecules). The prolongation process continues until no blank symbol is present, at which point a complete double stranded sequence is obtained [2,3].
A sequence $x_1 \Longrightarrow x_2 \Longrightarrow \cdots \Longrightarrow x_k$, where $x_1 \in A$, is called a computation in $\gamma$ of length $k - 1$ (notice that we start from $x_1 \in A$). If $x_k \in WK_\rho(V)$, that is, if no sticky end and hence no blank symbol is present in the last string of composite symbols, the computation is said to be complete.
A complete computation of a sticker system produces a complete string, denoted by $w \in WK_\rho(V)$. A sticker system generates the set of all such completed strings. The language of all such strings is the language generated by $\gamma$ and is called a sticker language. Sticker languages are explained in the next section.

Sticker Language
Sticker languages are generated by a language generating device based on the sticker operation, which is an abstract model of the annealing operation occurring in DNA computing [2,5].
The language generated by a sticker system consists of all strings formed by the upper strands of all complete molecules derived from the axioms for which an exactly matching sequence of lower stickers can be found [5,10].
Without restrictions, sticker systems generate only regular languages (but not all of them) [9]. Some restrictions on the matching pairs of strands have been introduced in [2,4,8,11] to obtain more powerful languages of sticker systems, such as primitive sticker languages, balanced sticker languages, and primitive balanced sticker languages.
The sticker language is the set of all strings (from complete molecules) generated by a sticker system. Let $\gamma = (V, \rho, A, D)$ be a sticker system; the unrestricted molecular language generated by $\gamma$ is defined in [11]. Furthermore, the unrestricted sticker language generated by $\gamma$ is the projection onto the first (upper) component of the molecular language, because of the complementarity relation $\rho$. The language is denoted by $SL(\gamma)$ and is defined by $SL(\gamma) = \{ w \in WK_\rho(V) \mid x \Longrightarrow^* w,\ x \in A \}$. In this research, SL will denote sticker language. Only complete computations (the set of all upper strands of all complete molecules) as defined above will be considered when defining SL. The similarity of this process to Adleman's experiment can be found in [2].
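To make the notion of a generated language concrete, the following Python sketch enumerates the upper strands of all complete molecules reachable within a bounded number of sticking steps. It is a simplified model under stated assumptions: molecules are prolonged only to the right, stickers are single-stranded, and a molecule is stored as a left-aligned pair of strings so that blanks are implicit and a molecule is complete exactly when both strands have the same length. The example system (axiom, stickers) and the names `extend` and `sticker_language` are hypothetical.

```python
RHO = {("a", "t"), ("t", "a"), ("c", "g"), ("g", "c")}


def extend(molecule, sticker, strand):
    """Append `sticker` to one strand; return None if facing bases clash."""
    upper, lower = molecule
    if strand == "upper":
        upper += sticker
    else:
        lower += sticker
    # zip stops at the shorter strand, so only facing bases are compared.
    if all((u, l) in RHO for u, l in zip(upper, lower)):
        return (upper, lower)
    return None


def sticker_language(axioms, upper_stickers, lower_stickers, max_steps):
    """Upper strands of complete molecules reachable in at most max_steps steps."""
    moves = [(s, "upper") for s in upper_stickers]
    moves += [(s, "lower") for s in lower_stickers]
    frontier = set(axioms)
    seen = set(frontier)
    language = set()
    for _ in range(max_steps + 1):
        next_frontier = set()
        for molecule in frontier:
            upper, lower = molecule
            if len(upper) == len(lower):
                language.add(upper)  # complete molecule: keep its upper strand
            for sticker, strand in moves:
                result = extend(molecule, sticker, strand)
                if result is not None and result not in seen:
                    seen.add(result)
                    next_frontier.add(result)
        frontier = next_frontier
    return language


# Hypothetical system: axiom with upper strand "at" and lower strand "t".
words = sticker_language({("at", "t")}, {"at"}, {"at", "a"}, max_steps=5)
```

For this toy system the complete molecules have upper strands of the form (at)^n, so the bounded enumeration returns the first few words of a regular language, in line with the statement that unrestricted sticker systems generate only regular languages.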
There are six families of languages generated by sticker systems: sticker languages, primitive sticker languages, balanced sticker languages, primitive balanced sticker languages, coherent sticker languages, and fair sticker languages, denoted by SL, PSL, BSL, PBSL, CSL and FSL, respectively. The definitions of these families of languages are given in [2].
It is a fact that the families of languages generated by sticker systems are included in the family of regular languages: SL ⊆ REG, PSL ⊆ REG, BSL ⊆ REG, PBSL ⊆ REG, CSL ⊆ REG, and FSL ⊆ REG. The proofs of these lemmas can be found in [2].
Below is an example to illustrate how sticking operation works on a sticker system.

Monoids
In abstract algebra, an algebraic structure consists of one or more sets, called underlying sets or carriers or sorts, closed under one or more operations satisfying some axioms; a monoid is one of the group-like structures. Specifically, a monoid is a set equipped with an associative binary operation and a neutral (identity) element.
The sticker system is first associated with the properties of a monoid $M$ to see whether the operation of sticking preserves the operation of the monoid. It is obvious that the sticker system $\gamma$ satisfies the conditions of a monoid.
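The monoid conditions (an identity element and associativity) can be spot-checked numerically. The Python sketch below tests them on small finite samples for the additive integers, the positive rationals under multiplication, and strings under concatenation. A finite check of this kind is only supporting evidence, not a proof, and the function name `satisfies_monoid_axioms` and the sample values are hypothetical.

```python
from fractions import Fraction
from itertools import product
import operator


def satisfies_monoid_axioms(samples, op, identity):
    """Check the identity and associativity laws on a finite sample."""
    has_identity = all(
        op(identity, x) == x and op(x, identity) == x for x in samples
    )
    associative = all(
        op(op(x, y), z) == op(x, op(y, z))
        for x, y, z in product(samples, repeat=3)
    )
    return has_identity and associative


integers = [-3, -1, 0, 2, 5]
positive_rationals = [Fraction(1, 3), Fraction(1), Fraction(5, 2), Fraction(7)]
strings = ["", "a", "tg", "cat"]

additive_ok = satisfies_monoid_axioms(integers, operator.add, 0)
multiplicative_ok = satisfies_monoid_axioms(positive_rationals, operator.mul, Fraction(1))
concatenation_ok = satisfies_monoid_axioms(strings, operator.add, "")
subtraction_ok = satisfies_monoid_axioms(integers, operator.sub, 0)
```

Subtraction on the integers fails the check because 0 is not a left identity and the operation is not associative, which shows that the test does discriminate.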

Grammar
To study languages mathematically, a mechanism is needed to describe them. In this section, the notion of a grammar is introduced, as it is considered a common and powerful tool [13] to describe languages produced by the complete computations of sticker systems over monoids. The basic grammars investigated in formal language theory, namely Chomsky grammars, are discussed, together with the relationship among several language families given by the Chomsky hierarchy.
The definitions of grammar and Chomsky grammar are given in the following.
Definition 2 [13]: Grammar. A grammar G is defined as a quadruple G = (V, T, S, P) where V is a finite set of objects called variables, T is a finite set of objects called terminal symbols, $S \in V$ is a special symbol called the start variable, and P is a finite set of production rules. The sets V and T are nonempty and disjoint.

Definition 3 [14]: Chomsky Grammar. A Chomsky grammar is a quadruple G = (N, T, S, P) where N is a finite set of nonterminal symbols, T is a finite set of terminal symbols, $S \in N$ is a special symbol called the start symbol, and P is a finite set of production rules. The sets N and T are nonempty and disjoint. The set P is a finite set of pairs (u, v) where $u, v \in (N \cup T)^*$, such that u contains at least one symbol from N. In this paper, the production rules are written in the form $u \longrightarrow v$, where u is an element of $(N \cup T)^+$ and v is in $(N \cup T)^*$.
The language generated by a Chomsky grammar G is denoted by $L(G)$ and is defined as follows: $L(G) = \{ w \in T^* \mid S \Longrightarrow^* w \}$. A number of language families are encountered, namely the recursively enumerable, context-sensitive, context-free and regular languages. These languages are generated by unrestricted grammars, context-sensitive grammars, context-free grammars and regular grammars, respectively, and the grammars make up the Chomsky hierarchy, named after Noam Chomsky, a founder of formal language theory [13]. The families of finite, regular, linear, context-free, context-sensitive and recursively enumerable languages are denoted by FIN, REG, LIN, CF, CS, and RE, respectively. They are said to form the Chomsky hierarchy because of the following inclusions: FIN ⊆ REG ⊆ LIN ⊆ CF ⊆ CS ⊆ RE.
Among the language families of interest are the context-sensitive languages (CS), the context-free languages (CF), and the regular languages (REG). The characterization of the regular languages with respect to the generative capacity of sticker systems will be investigated.

Valence Grammar and Extended Valence H System
A grammar with regulated rewriting, termed a valence grammar, was first introduced by Paun [14]. In a valence grammar, each production of a Chomsky grammar is associated with an integer value; the value of a derivation is computed by adding the valences of the productions used, and a derivation is considered valid only if these valences add to zero [15].
Definition 4 [14]: Valence Grammar. A valence grammar over a monoid is a quintuple G = (N, T, S, P, M) where N, T, S are specified as in a context-free grammar and M = (M, ⋄) is a monoid with neutral element e and binary operation ⋄. The set P is a finite set of pairs r = (p, m) with a context-free rule p and $m \in M$. The language $L(G)$ generated by G is called a valence language and is defined as $L(G) = \{ w \mid w \in T^*, (S, e) \Longrightarrow^* (w, e) \}$. The operation of the sticker system is designed as a language generating mechanism so that the valences of the productions used add to zero or multiply to one, according to the additive or multiplicative valence grammar, respectively.
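The effect of valences on a derivation can be illustrated with a small Python sketch. It assumes a hypothetical right-linear valence grammar with rules S → aS, S → bS and S → λ, where each rule carries a monoid element and the rule S → λ carries the neutral element; a word is accepted only if the valences of the rules used combine to the neutral element. The function name `valence_language` and the chosen valences are illustrative assumptions, not taken from [14] or [15].

```python
from fractions import Fraction
import operator


def valence_language(rules, op, identity, max_len):
    """Words of length <= max_len whose derivation valence is the identity.

    rules: list of (terminal, valence), each read as the rule S -> terminal S.
    The terminating rule S -> lambda is assumed to carry the identity.
    """
    accepted = set()
    states = [("", identity)]  # (terminal prefix derived so far, valence so far)
    for _ in range(max_len + 1):
        next_states = []
        for word, valence in states:
            if valence == identity:
                accepted.add(word)  # finish the derivation with S -> lambda
            for terminal, m in rules:
                next_states.append((word + terminal, op(valence, m)))
        states = next_states
    return accepted


# Additive valences over (Z, +, 0): +1 for a, -1 for b.
additive = valence_language([("a", 1), ("b", -1)], operator.add, 0, 4)

# Multiplicative valences over the positive rationals: 2 for a, 1/2 for b.
multiplicative = valence_language(
    [("a", Fraction(2)), ("b", Fraction(1, 2))], operator.mul, Fraction(1), 4
)
```

Both choices accept exactly the words with equally many a's and b's, a language that is not regular even though the underlying rules are right-linear, which is the kind of gain in power that valences are meant to provide.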
Sticker languages produced by the operation of sticking are the languages generated by sticker systems. In this study, valence sticker languages, additive valence sticker languages, and multiplicative valence sticker languages are introduced with respect to the respective valence grammars.
A sticker system with a monoid operation is used, in which an element of a monoid is associated with each sticker operation and the monoid value of a new string is computed from the values of the two strings used to produce it.
Extended H systems over the groups $(\mathbb{Z}, +, 0)$ and $(\mathbb{Q}_+, \cdot, 1)$ were introduced in [16] and are known as extended valence H systems. The definition of the extended valence H system is given in [16], which discusses the computational power of H systems with valences. In such a system, a number is associated with each initial string, and the numbers are computed for each newly produced string. Below is the definition of the extended valence H system.
Definition 5 [16]: Extended Valence H Systems. Let $(M, \diamond, e)$ be a group with operation ⋄ and identity e. An extended valence H system over M is a construct $\gamma = (V, T, A, R)$, where V, T, and R are as in the usual extended H system and A is a finite subset of $V^* \times M$.
The valence grammar introduced by Dassow and Paun in 1989 [14] will be used to introduce the concept of sticker systems over monoids. The definition of the extended valence H system given by Manca and Paun in [16] is used to define valence sticker systems over monoids.

VALENCE STICKER SYSTEM
A new definition of sticker systems over monoids, termed valence sticker system, is introduced in this research and is given in the following:
Definition 6: Valence Sticker System. Let $(M, \diamond, e)$ be a monoid with operation ⋄ and identity e. A valence sticker system over the monoid M is a construct $\gamma = (V, \rho, A, D, M)$, where $V$, $\rho$, $A$ are as in the usual definition of a sticker system and D is a finite subset of $(B_u \times M, B_d \times M)$, with $B_u$ and $B_d$ the sets of upper and lower stickers. The language generated by a valence sticker system is called a valence sticker language (VSL) and is defined as $VSL(\gamma) = \{ w \mid w \in WK_\rho(V), (A, e) \Longrightarrow^* (w, e) \}$,
where $w \in WK_\rho(V)$ is a complete double strand produced by iterating the sticking operation on a starting axiom A with elements of D. The valence assigned to A must be equal to the identity e of the monoid, and the computation of the sticker system with the associated elements of the monoid must produce the identity element e. We can define multiplicative and additive valence sticker systems when the binary operation ⋄ is multiplication and addition, respectively, namely over $M_1 = (\mathbb{Q}_+, \cdot, 1)$ and $M_2 = (\mathbb{Z}, +, 0)$.
The idea behind this computation is that an element of the monoid is associated with each axiom and, with each sticker operation, the valence of the new string is computed by the monoid operation from the valences of the two strings used. A complete double stranded sequence is considered valid if the computation of the associated elements produces the identity element of the monoid.
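The role of the valences can be sketched in Python under simplifying assumptions: prolongation is to the right only, stickers are single-stranded, a molecule is stored as a left-aligned pair of strands (so it is complete exactly when both strands have equal length), and every axiom carries the identity. The toy system, the valences attached to its stickers and the names `extend` and `valence_sticker_language` are hypothetical and serve only to show how the identity condition filters complete molecules.

```python
from fractions import Fraction
import operator

RHO = {("a", "t"), ("t", "a"), ("c", "g"), ("g", "c")}


def extend(upper, lower, sticker, strand):
    """Append `sticker` to one strand; return None if facing bases clash."""
    if strand == "upper":
        upper += sticker
    else:
        lower += sticker
    ok = all((u, l) in RHO for u, l in zip(upper, lower))
    return (upper, lower) if ok else None


def valence_sticker_language(axioms, stickers, op, identity, max_steps):
    """Upper strands of complete molecules whose valence equals the identity.

    stickers: list of (string, strand, monoid element).
    """
    frontier = {(u, l, identity) for (u, l) in axioms}
    seen = set(frontier)
    valid = set()
    for _ in range(max_steps + 1):
        next_frontier = set()
        for upper, lower, valence in frontier:
            if len(upper) == len(lower) and valence == identity:
                valid.add(upper)
            for sticker, strand, m in stickers:
                result = extend(upper, lower, sticker, strand)
                if result is None:
                    continue
                state = (result[0], result[1], op(valence, m))
                if state not in seen:
                    seen.add(state)
                    next_frontier.add(state)
        frontier = next_frontier
    return valid


axioms = {("at", "t")}
# Additive valences over (Z, +, 0).
additive = valence_sticker_language(
    axioms,
    [("at", "upper", 1), ("at", "lower", 0), ("a", "lower", -2)],
    operator.add, 0, max_steps=5,
)
# Multiplicative valences over the positive rationals.
multiplicative = valence_sticker_language(
    axioms,
    [("at", "upper", Fraction(2)), ("at", "lower", Fraction(1)),
     ("a", "lower", Fraction(1, 4))],
    operator.mul, Fraction(1), max_steps=5,
)
```

Without valences this toy system completes the upper strands at, atat and atatat within five steps; with the valences above, only atatat combines to the identity, in both the additive and the multiplicative setting. Because both monoids are commutative, the order in which the valences are combined does not matter here.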
In the following, an example of the computation of a sticker system with additive and multiplicative monoids is given.