Data Structure: Strings


Main memory layout of a computer.

A computer's main memory is an arrangement of bits organized into words.

*--------**--------**--------**--------**--------**--------**--------*            *--------*
|........||........||........||........||........||........||........|            |........|
*--------**--------**--------**--------**--------**--------**--------* .......... *--------*

Memory organized in 8-bit words



*----------------**----------------**----------------**----------------*        *----------------*
|................||................||................||................|        |................|
*----------------**----------------**----------------**----------------* ...... *----------------*

Memory organized in 16-bit words




*------------------------------------------------------------*        *------------------------------------------------------------*
|............................................................|        |............................................................|
*------------------------------------------------------------* ...... *------------------------------------------------------------*

Memory organized in 60-bit words

A computer's memory access mechanism retrieves and stores data in units of words, and the time it takes is constant, independent of the word's position in memory.


Character sets of programming languages.

Programming languages use characters, which are defined in character sets. Within a set, every character has the same width in bits.

*-------**-------**-------**-------**-------**-------**-------**-------*         *-------*
|.......||.......||.......||.......||.......||.......||.......||.......|         |.......|
*-------**-------**-------**-------**-------**-------**-------**-------* ....... *-------*

7-bit character set - 2**7 characters = 128 characters


*--------**--------**--------**--------**--------**--------**--------*            *--------*
|........||........||........||........||........||........||........|            |........|
*--------**--------**--------**--------**--------**--------**--------* .......... *--------*

8-bit character set - 2**8 characters = 256 characters


*----------------**----------------**----------------**----------------*        *----------------*
|................||................||................||................|        |................|
*----------------**----------------**----------------**----------------* ...... *----------------*

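16-bit character set - 2**16 characters = 65536 characters

Unicode was designed as a character set that uses 16 bits and aims to represent every character of the languages in use today. Its code space has since grown beyond 16 bits, so a character outside the first 65536 needs more than one 16-bit unit. These notes count about 35000 characters in use; current versions of the standard define far more. The definition of the set is published on the Internet.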
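
The relation between bit width and set size can be checked directly. The following is a minimal sketch in Java; the class and method names (CharsetSizes, codes, unitsFor) are illustrative assumptions and do not come from the original notes. It also shows that a Java char is a 16-bit unit, so a code point above U+FFFF takes two of them.

```java
class CharsetSizes {
    // Number of distinct codes available with the given bit width (2**bits).
    static long codes(int bits) {
        return 1L << bits;
    }

    // Number of 16-bit Java chars needed to hold one Unicode code point.
    static int unitsFor(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(codes(7));             // 128
        System.out.println(codes(8));             // 256
        System.out.println(codes(16));            // 65536
        System.out.println(Character.SIZE);       // 16 (bits in a Java char)
        System.out.println(unitsFor(0x41));       // 1  (LATIN CAPITAL LETTER A)
        System.out.println(unitsFor(0x1F600));    // 2  (outside the 16-bit range)
    }
}
```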


String Examples - Configurations

*------------------------------------------------------------*        *------------------------------------------------------------*
|......                                                      |        |......                                                      |
*------------------------------------------------------------* ...... *------------------------------------------------------------*

One character per word
*------------------------------------------------------------*        *------------------------------------------------------------*
|............................................................|        |............................................................|
*------------------------------------------------------------* ...... *------------------------------------------------------------*

Ten characters per word
*--------**--------**--------**--------**--------**--------*            *--------**--------*
|........||........||........||........||........||........|            |........||........|
*--------**--------**--------**--------**--------**--------* .......... *--------**--------*
<--- 1 character --><--- 1 character --><--- 1 character -->            <--- 1 character -->

One character spans two words
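
The storage cost of the three configurations above can be compared with a little arithmetic. The sketch below is only an illustration: the class WordPacking and its methods are hypothetical names, and it assumes that a character never straddles a word boundary unless it is wider than a word.

```java
class WordPacking {
    // Words occupied by one character when each character starts on a word boundary.
    static int wordsPerChar(int wordBits, int charBits) {
        return (charBits + wordBits - 1) / wordBits;   // ceiling division
    }

    // One character per word (or per group of words); leftover bits stay unused.
    static int wordsUnpacked(int n, int wordBits, int charBits) {
        return n * wordsPerChar(wordBits, charBits);
    }

    // As many whole characters per word as fit; falls back to the unpacked
    // layout when a character is wider than a word.
    static int wordsPacked(int n, int wordBits, int charBits) {
        int perWord = wordBits / charBits;
        if (perWord == 0) {
            return wordsUnpacked(n, wordBits, charBits);
        }
        return (n + perWord - 1) / perWord;            // ceiling division
    }

    public static void main(String[] args) {
        // 25 characters of 6 bits in 60-bit words
        System.out.println(wordsUnpacked(25, 60, 6));  // 25 (one character per word)
        System.out.println(wordsPacked(25, 60, 6));    // 3  (ten characters per word)
        // 25 characters of 16 bits in 8-bit words
        System.out.println(wordsPacked(25, 8, 16));    // 50 (two words per character)
    }
}
```

Packing saves space at the price of extra shifting and masking to reach a single character, while the unpacked layout gives direct access to each character.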


Representations.


                  |<------------------ Fixed Number of Characters ( n ) ----------------->|

                  *-----------------------------------------------------------------------*
                  |L|a|s| |E|s|t|r|u|c|t|u|r|a|s| |d|e| |D|a|t|o|s| |s|o|n| |m|o|d|e|l|o|s|
                  *-----------------------------------------------------------------------*
                                            A
                                            |
                  Index ( 0, n-1 ) ---------+          X = Components ( All Characters )
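
The picture above maps directly onto Java's length() and charAt(int): n components, indexed from 0 to n-1. A minimal sketch follows; the class FixedLengthView and the helper count are hypothetical names introduced only for this illustration.

```java
class FixedLengthView {
    // Visits every component through its index, from 0 to n-1,
    // and counts how many are equal to target.
    static int count(String s, char target) {
        int n = s.length();
        int hits = 0;
        for (int i = 0; i < n; i++) {
            if (s.charAt(i) == target) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String s = "Las Estructuras de Datos son modelos";
        int n = s.length();
        System.out.println(n);                // 36
        System.out.println(s.charAt(0));      // L
        System.out.println(s.charAt(n - 1));  // s
        System.out.println(count(s, ' '));    // 5
        // s.charAt(n) would throw StringIndexOutOfBoundsException:
        // valid indices stop at n-1.
    }
}
```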


Definitions

The String data structure represents a sequence of characters. Once created, a String is a constant, which makes it an immutable object type. String includes methods for examining individual characters through the non-negative integer index that orders the sequence. Other methods work on substrings to compare, search, extract, move, and copy. A character string has a length, which is an integer greater than or equal to zero. If T is a character string and n is its length, T is said to be empty, or a null character string, when n = 0. The characters that make up a string come from a character set defined in the computing environment.
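
Immutability and the empty string can be observed in a few lines of Java. This is a small sketch, not a definitive treatment; the class ImmutableDemo and the helper isEmptyString are hypothetical names chosen for the example.

```java
class ImmutableDemo {
    // A string T of length n is empty (a null string in the sense above) when n = 0.
    static boolean isEmptyString(String t) {
        return t.length() == 0;
    }

    public static void main(String[] args) {
        String original = "datos";
        // Each operation returns a new String; the receiver is never modified.
        String upper = original.toUpperCase();
        String longer = original.concat(" y modelos");

        System.out.println(original);                 // datos
        System.out.println(upper);                    // DATOS
        System.out.println(longer);                   // datos y modelos
        System.out.println(isEmptyString(""));        // true
        System.out.println(isEmptyString(original));  // false
    }
}
```

Because a String never changes after creation, references to it can be shared freely without defensive copies.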


Interfaces

Axiomatic Interface

structure STRING

        declare NULL()                            ----> string        // Produces an empty String instance
                ISNULL(string)                    ----> boolean       // Returns true if the String is empty
                IN(string, char )                 ----> string        // Appends a character to the end of the String
                LEN(string)                       ----> integer       // Returns the length of the String
                CONCAT(string, string)            ----> string        // Returns a new String with the characters of the first String followed by those of the second
                SUBSTR(string, integer, integer)  ----> string        // Returns a new String of consecutive characters taken from the String, given a start position and a count
                INDEX(string, string)             ----> integer       // Returns the starting position of the second String within the first, or 0 if it does not occur

        for all S, T in string
                i, j in integer
                c, d in char

        let
                ISNULL(NULL)             ====> true
                ISNULL(IN(S,c))          ====> false
                LEN(NULL)                ====> 0
                LEN(IN(S,c))             ====> 1 + LEN(S)
                CONCAT(S,NULL)           ====> S
                CONCAT(S,IN(T,c))        ====> IN(CONCAT(S,T),c)
                SUBSTR(NULL,i,j)         ====> NULL
                SUBSTR(IN(S,c),i,j)      ====> if j = 0 or i + j - 1 > LEN(IN(S,c))
                                                    then NULL
                                                    else if i + j - 1 = LEN(IN(S,c))
                                                                     then IN(SUBSTR(S,i,j-1),c)
                                                                     else SUBSTR(S,i,j)
                INDEX(S,NULL)            ====> LEN(S) + 1
                INDEX(NULL,IN(T,d))      ====> 0
                INDEX(IN(S,c),IN(T,d))   ====> if INDEX(S,IN(T,d)) not = 0
                                                    then INDEX(S,IN(T,d))
                                                    else if c = d and INDEX(S,T) = LEN(S) - LEN(T) + 1
                                                                     then INDEX(S,T)
                                                                     else 0
        end

end STRING
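
One way to make the axioms concrete is to transcribe them almost literally into recursive Java functions. The sketch below is an assumption-laden illustration, not an efficient or official implementation: the class AxStr, its fields, and the helpers of and show are hypothetical, and positions are 1-based, as in the axioms.

```java
// Strings built only from NULL and IN, with operations that follow the axioms.
final class AxStr {
    static final AxStr NULL = new AxStr(null, '\0');   // the empty string

    final AxStr init;  // S in IN(S,c); unused for NULL
    final char last;   // c in IN(S,c); unused for NULL

    private AxStr(AxStr init, char last) {
        this.init = init;
        this.last = last;
    }

    static AxStr in(AxStr s, char c) { return new AxStr(s, c); }

    static boolean isNull(AxStr s) { return s == NULL; }

    static int len(AxStr s) { return isNull(s) ? 0 : 1 + len(s.init); }

    static AxStr concat(AxStr s, AxStr t) {
        return isNull(t) ? s : in(concat(s, t.init), t.last);
    }

    // i is the 1-based start position, j the number of characters.
    static AxStr substr(AxStr s, int i, int j) {
        if (isNull(s)) return NULL;
        int end = i + j - 1;
        if (j == 0 || end > len(s)) return NULL;
        if (end == len(s)) return in(substr(s.init, i, j - 1), s.last);
        return substr(s.init, i, j);
    }

    // Follows the three INDEX rules in the order they are written.
    static int index(AxStr s, AxStr t) {
        if (isNull(t)) return len(s) + 1;
        if (isNull(s)) return 0;
        int earlier = index(s.init, t);
        if (earlier != 0) return earlier;
        int tail = index(s.init, t.init);
        if (s.last == t.last && tail == len(s.init) - len(t.init) + 1) return tail;
        return 0;
    }

    // Convenience helpers (not part of the axioms): build from and render to a Java String.
    static AxStr of(String text) {
        AxStr s = NULL;
        for (int k = 0; k < text.length(); k++) s = in(s, text.charAt(k));
        return s;
    }

    static String show(AxStr s) {
        return isNull(s) ? "" : show(s.init) + s.last;
    }
}
```

For example, AxStr.show(AxStr.substr(AxStr.of("abcd"), 3, 2)) yields "cd" and AxStr.index(AxStr.of("abcd"), AxStr.of("cd")) yields 3. One caveat worth checking by hand: INDEX(S,T) reports the first occurrence of T, so a literal reading of the last rule gives 0 for INDEX('aab','ab') even though 'ab' occurs at position 2. The rule may need a stricter suffix test.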


Examples

S = 'abcd' can be represented as IN(IN(IN(IN(NULL,a),b),c),d)

Rewriting "abcd" -----> "b"

SUBSTR(S,2,1) can be rewritten in sequence as SUBSTR(IN(IN(IN(NULL,a),b),c),2,1)
                                              SUBSTR(IN(IN(NULL,a),b),2,1)
                                              IN(SUBSTR(IN(NULL,a),2,0),b)
                                              IN(NULL,b)
                                              'b'

Rewriting "abcd" -----> "cd"

SUBSTR(S,3,2) can be rewritten in sequence as IN(SUBSTR(IN(IN(IN(NULL,a),b),c),3,1),d)
                                              IN(IN(SUBSTR(IN(IN(NULL,a),b),3,0),c),d)
                                              IN(IN(NULL,c),d)
                                              'cd'
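
The same two results can be reproduced with Java's String.substring, keeping in mind that SUBSTR takes a 1-based start and a count, while substring takes a 0-based begin index and an exclusive end index. The sketch below uses a hypothetical helper, SubstrMapping.substr, and returns the empty string where the axioms give NULL. Treating a start position below 1 as NULL as well is an assumption of this sketch, since the axioms do not address it.

```java
class SubstrMapping {
    // SUBSTR(S, i, j): j characters of s starting at 1-based position i.
    static String substr(String s, int i, int j) {
        if (j <= 0 || i < 1 || i + j - 1 > s.length()) {
            return "";                          // plays the role of NULL
        }
        return s.substring(i - 1, i - 1 + j);   // 0-based begin, exclusive end
    }

    public static void main(String[] args) {
        System.out.println(substr("abcd", 2, 1));  // b
        System.out.println(substr("abcd", 3, 2));  // cd
    }
}
```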

Functional Interface


                  crearString(string)
                  crearString("Buenos Dias Rosario", string)

                  averiguarLongitud(string, longitud)

                  recuperarCaracter(string, posicion, caracter)
                  recuperarSubstringComienzo(string, cantidad, string)
                  recuperarSubstringRango(string, posicionDesde, cantidad, string)
                  recuperarSubstringFinal(string, cantidad, string)

                  buscarCaracterComienzo(string, caracter, posicion)
                  buscarCaracterFinal(string, caracter, posicion)
                  buscarCaracterComienzoRango(string, caracter, posicionDesde, cantidad, posicion)
                  buscarCaracterFinalRango(string, caracter, posicionDesde, cantidad, posicion)
                  buscarSubstringComienzo(string, substring, posicion)
                  buscarSubstringFinal(string, substring, posicion)
                  buscarSubstringComienzoRango(string, substring, posicionDesde, cantidad, posicion)
                  buscarSubstringFinalRango(string, substring, posicionDesde, cantidad, posicion)

                  ejecutarConversionMayusculas(string, string)
                  ejecutarConversionMinusculas(string, string)
                  ejecutarCopia(string, string)
                  ejecutarConcatenacion(string, string, string)
                  ejecutarComparacion(string, string, boolean)
                  ejecutarComparacionIgnorandoMayuscula(string, string, boolean)
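
The notes list only the names and parameters of these operations, so their exact semantics are not specified. As a rough sketch, a few of them could be layered over java.lang.String as static functions that return their result instead of filling an output parameter. Everything about the behavior below is an assumption made for illustration: 0-based positions, -1 for "not found", and the class name StringFunctions.

```java
final class StringFunctions {
    private StringFunctions() {}

    static int averiguarLongitud(String s) {
        return s.length();
    }

    static char recuperarCaracter(String s, int posicion) {
        return s.charAt(posicion);
    }

    // First 'cantidad' characters.
    static String recuperarSubstringComienzo(String s, int cantidad) {
        return s.substring(0, cantidad);
    }

    // 'cantidad' characters starting at 'posicionDesde'.
    static String recuperarSubstringRango(String s, int posicionDesde, int cantidad) {
        return s.substring(posicionDesde, posicionDesde + cantidad);
    }

    // Last 'cantidad' characters.
    static String recuperarSubstringFinal(String s, int cantidad) {
        return s.substring(s.length() - cantidad);
    }

    // Search from the beginning; -1 if absent.
    static int buscarCaracterComienzo(String s, char caracter) {
        return s.indexOf(caracter);
    }

    // Search from the end; -1 if absent.
    static int buscarCaracterFinal(String s, char caracter) {
        return s.lastIndexOf(caracter);
    }

    static int buscarSubstringComienzo(String s, String substring) {
        return s.indexOf(substring);
    }

    static boolean ejecutarComparacionIgnorandoMayuscula(String a, String b) {
        return a.equalsIgnoreCase(b);
    }
}
```

With s = "Buenos Dias Rosario", recuperarSubstringRango(s, 7, 4) gives "Dias" and buscarCaracterFinal(s, 'o') gives 18 under these assumptions.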

Object-Oriented Interface

Constructors

String()
Initializes a newly created String object so that it represents an empty character sequence.

String(byte[] bytes)
Constructs a new String by converting the specified array of bytes using the platform's default character encoding.

String(byte[] bytes, int offset, int length)
Constructs a new String by converting the specified subarray of bytes using the platform's default character encoding.

String(byte[] bytes, int offset, int length, String enc)
Constructs a new String by converting the specified subarray of bytes using the specified character encoding.

String(byte[] bytes, String enc)
Constructs a new String by converting the specified array of bytes using the specified character encoding.

String(char[] value)
Allocates a new String so that it represents the sequence of characters currently contained in the character array argument.

String(char[] value, int offset, int count)
Allocates a new String that contains characters from a subarray of the character array argument.

String(String value)
Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string.

String(StringBuffer buffer)
Allocates a new string that contains the sequence of characters currently contained in the string buffer argument.
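
A few of the constructors above in use. This is a minimal sketch; the class ConstructorDemo and its helper methods are hypothetical names. The byte-array example names the encoding explicitly ("UTF-8"), since the platform default can differ between machines.

```java
class ConstructorDemo {
    // String(byte[] bytes, String enc): decode bytes with a named encoding.
    static String fromUtf8(byte[] bytes) {
        try {
            return new String(bytes, "UTF-8");
        } catch (java.io.UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // String(char[] value, int offset, int count): copy part of a char array.
    static String slice(char[] value, int offset, int count) {
        return new String(value, offset, count);
    }

    // String(char[] value): the characters are copied, not shared.
    static String copyOf(char[] value) {
        return new String(value);
    }

    public static void main(String[] args) {
        char[] letters = {'R', 'o', 's', 'a', 'r', 'i', 'o'};
        String all = copyOf(letters);
        letters[0] = 'X';                                   // does not affect 'all'
        System.out.println(all);                            // Rosario
        System.out.println(slice(letters, 1, 3));           // osa
        System.out.println(fromUtf8(new byte[] {72, 111, 108, 97}));  // Hola
        System.out.println(new String().length());          // 0
        System.out.println(new String(new StringBuffer("Buenos").append(" Dias")));  // Buenos Dias
    }
}
```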

Methods

char charAt(int index)
Returns the character at the specified index.

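int compareTo(Object o)
Compares this String to another Object. This overload belongs to older Java releases; since Java 5, String implements Comparable<String> and declares only compareTo(String).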

int compareTo(String anotherString)
Compares two strings lexicographically.

int compareToIgnoreCase(String str)
Compares two strings lexicographically, ignoring case considerations.

String concat(String str)
Concatenates the specified string to the end of this string.

static String copyValueOf(char[] data)
Returns a String that is equivalent to the specified character array.

static String copyValueOf(char[] data, int offset, int count)
Returns a String that is equivalent to the specified subarray of the character array.

boolean endsWith(String suffix)
Tests if this string ends with the specified suffix.

boolean equals(Object anObject)
Compares this string to the specified object.

boolean equalsIgnoreCase(String anotherString)
Compares this String to another String, ignoring case considerations.

byte[] getBytes()
Converts this String into bytes according to the platform's default character encoding, storing the result into a new byte array.

byte[] getBytes(String enc)
Converts this String into bytes according to the specified character encoding, storing the result into a new byte array.

void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
Copies characters from this string into the destination character array.

int hashCode()
Returns a hashcode for this string.

int indexOf(int ch)
Returns the index within this string of the first occurrence of the specified character.

int indexOf(int ch, int fromIndex)
Returns the index within this string of the first occurrence of the specified character, starting the search at the specified index.

int indexOf(String str)
Returns the index within this string of the first occurrence of the specified substring.

int indexOf(String str, int fromIndex)
Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.

String intern()
Returns a canonical representation for the string object.

int lastIndexOf(int ch)
Returns the index within this string of the last occurrence of the specified character.

int lastIndexOf(int ch, int fromIndex)
Returns the index within this string of the last occurrence of the specified character, searching backward starting at the specified index.

int lastIndexOf(String str)
Returns the index within this string of the rightmost occurrence of the specified substring.

int lastIndexOf(String str, int fromIndex)
Returns the index within this string of the last occurrence of the specified substring, searching backward starting at the specified index.

int length()
Returns the length of this string.

boolean regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)
Tests if two string regions are equal.

boolean regionMatches(int toffset, String other, int ooffset, int len)
Tests if two string regions are equal.

String replace(char oldChar, char newChar)
Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar.

boolean startsWith(String prefix)
Tests if this string starts with the specified prefix.

boolean startsWith(String prefix, int toffset)
Tests if this string starts with the specified prefix beginning at the specified index.

String substring(int beginIndex)
Returns a new string that is a substring of this string, from beginIndex to the end of the string.

String substring(int beginIndex, int endIndex)
Returns a new string that is a substring of this string, from beginIndex up to, but not including, endIndex.

char[] toCharArray()
Converts this string to a new character array.

String toLowerCase()
Converts all of the characters in this String to lower case using the rules of the default locale, which is returned by Locale.getDefault.

String toLowerCase(Locale locale)
Converts all of the characters in this String to lower case using the rules of the given Locale.

String toString()
This object (which is already a string!) is itself returned.

String toUpperCase()
Converts all of the characters in this String to upper case using the rules of the default locale, which is returned by Locale.getDefault.

String toUpperCase(Locale locale)
Converts all of the characters in this String to upper case using the rules of the given locale.

String trim()
Removes white space from both ends of this string.

static String valueOf(boolean b)
Returns the string representation of the boolean argument.

static String valueOf(char c)
Returns the string representation of the char argument.

static String valueOf(char[] data)
Returns the string representation of the char array argument.

static String valueOf(char[] data, int offset, int count)
Returns the string representation of a specific subarray of the char array argument.

static String valueOf(double d)
Returns the string representation of the double argument.

static String valueOf(float f)
Returns the string representation of the float argument.

static String valueOf(int i)
Returns the string representation of the int argument.

static String valueOf(long l)
Returns the string representation of the long argument.

static String valueOf(Object obj)
Returns the string representation of the Object argument.
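
A short tour of several of the methods listed above. This is a sketch for orientation only; the class MethodTour and the helper countOccurrences are hypothetical names, and the helper counts non-overlapping matches by repeatedly calling indexOf(String, int).

```java
class MethodTour {
    // Counts non-overlapping occurrences of pattern in text.
    static int countOccurrences(String text, String pattern) {
        if (pattern.length() == 0) {
            return 0;                         // avoid an endless loop on ""
        }
        int count = 0;
        int from = text.indexOf(pattern);
        while (from >= 0) {
            count++;
            from = text.indexOf(pattern, from + pattern.length());
        }
        return count;
    }

    public static void main(String[] args) {
        String t = "  Buenos Dias Rosario  ".trim();
        System.out.println(t.length());                                  // 19
        System.out.println(t.charAt(7));                                 // D
        System.out.println(t.indexOf('o'));                              // 4
        System.out.println(t.lastIndexOf('o'));                          // 18
        System.out.println(t.indexOf("os", 5));                          // 13
        System.out.println(t.substring(12));                             // Rosario
        System.out.println(t.substring(7, 11));                          // Dias
        System.out.println(t.startsWith("Dias", 7));                     // true
        System.out.println(t.endsWith("rio"));                           // true
        System.out.println(t.regionMatches(true, 12, "ROSARIO", 0, 7));  // true
        System.out.println(t.replace('o', '0'));                         // Buen0s Dias R0sari0
        System.out.println(t.equalsIgnoreCase("BUENOS DIAS ROSARIO"));   // true
        System.out.println("abc".compareTo("abd") < 0);                  // true
        System.out.println(String.valueOf(42) + String.valueOf(true));   // 42true
        System.out.println(countOccurrences(t, "os"));                   // 2
    }
}
```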