Unicode is important because it is a standard that lets computers consistently represent and manipulate text written in any existing writing system.
This post presents C source code that displays non-ASCII characters properly in the console. To display non-ASCII characters, the program has to decode the input bytes from their character set and re-encode them, here as UTF-8. There are various ways to handle Unicode in C; here I have embedded Python and used its Unicode support.
To embed Python in C, we need to include the Python.h header in the program. The python-dev package, which contains Python.h, must be installed before building this program. More detail on embedding Python in C is described here.
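To make the "encoding" step concrete, here is a minimal sketch in plain C of how a single Unicode code point is turned into UTF-8 bytes. It is not part of the program below, which delegates this work to Python; the function name utf8_encode is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * `out` must have room for at least 4 bytes.
 * Returns the number of bytes written, or 0 if `cp` is a surrogate
 * or lies above U+10FFFF and therefore cannot be encoded. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {   /* UTF-16 surrogates are invalid */
        return 0;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx 10xxxxxx x3 */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, the character 象 (U+8C61) encodes to the three bytes E8 B1 A1, which is why a three-character string such as 象形字 occupies nine bytes in UTF-8.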
source code
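#include <Python.h>   /* include before the standard headers */
#include <glib.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Decode `buffer` from `charset` and return a newly allocated UTF-8 copy.
   The caller must free() the result. Returns NULL on failure. */
char *get_encoded_msg(const char *buffer, const char *charset)
{
    Py_ssize_t ssize = (Py_ssize_t)strlen(buffer);

    /* "replace" substitutes U+FFFD for bytes that cannot be decoded. */
    PyObject *pyobject_unicode = PyUnicode_Decode(buffer, ssize, charset, "replace");
    if (pyobject_unicode == NULL)
    {
        printf("decode failed for: %s\n", buffer);
        return NULL;
    }

    PyObject *pystring = PyUnicode_AsUTF8String(pyobject_unicode);
    Py_DECREF(pyobject_unicode);   /* no longer needed, also on the error path */
    if (pystring == NULL)
    {
        printf("UTF-8 encode failed for: %s\n", buffer);
        return NULL;
    }

    /* PyBytes_AsString replaces the Python 2-only PyString_AsString. */
    const char *encoded_str = PyBytes_AsString(pystring);
    char *encoded_str_dup = (encoded_str != NULL) ? strdup(encoded_str) : NULL;
    Py_DECREF(pystring);
    if (encoded_str_dup == NULL)
    {
        return NULL;
    }
    printf("Encoded string: %s\n", encoded_str_dup);

    /* Character count of the whole string (-1 means "until the NUL"). */
    glong new_glength = g_utf8_strlen(encoded_str_dup, -1);
    printf("new length = %ld\n", new_glength);

    /* With a byte limit of 4, only the first 4 bytes of "laxmi" are counted. */
    const char *test = "laxmi";
    glong new_len = g_utf8_strlen(test, 4);
    printf("new = %ld\n", new_len);

    /* First three characters; g_utf8_substring allocates, so g_free it. */
    gchar *required_message = g_utf8_substring(encoded_str_dup, 0, 3);
    printf("final value = %s\n", required_message);
    g_free(required_message);

    return encoded_str_dup;
}

int main(void)
{
    /* Initialize the Python interpreter */
    Py_Initialize();

    /* Another possible test input: "象形字 xiàngxíngzì" */
    const char *message = "象形字";
    size_t len = strlen(message);   /* byte length, not character count */
    printf("length of unicode string = %zu\n", len);

    const char *charset = "UTF-8";
    char *encoded_msg = get_encoded_msg(message, charset);
    free(encoded_msg);   /* free(NULL) is a no-op */

    Py_Finalize();
    return 0;
}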
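The listing prints the byte length from strlen and the character count from g_utf8_strlen, and takes a character-based slice with g_utf8_substring. As a rough illustration of what those GLib calls are doing, here is a plain C sketch under the assumption that the input is already valid UTF-8. The names utf8_count and utf8_prefix are hypothetical helpers, not GLib or Python APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8
 * string. Continuation bytes have the form 10xxxxxx, so every byte
 * that is NOT a continuation byte starts a new character.
 * Assumes the input is valid UTF-8. */
size_t utf8_count(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}

/* Hypothetical helper: return a newly allocated copy of the first
 * `max_chars` code points of `s` (or all of `s` if it is shorter).
 * The caller must free() the result. Returns NULL if malloc fails. */
char *utf8_prefix(const char *s, size_t max_chars)
{
    assert(s != NULL);
    const char *p = s;
    size_t seen = 0;

    while (*p != '\0') {
        if (((unsigned char)*p & 0xC0) != 0x80) {
            if (seen == max_chars) {
                break;          /* p points at the start of the next char */
            }
            seen++;
        }
        p++;
    }

    size_t nbytes = (size_t)(p - s);
    char *out = malloc(nbytes + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, s, nbytes);
    out[nbytes] = '\0';
    return out;
}
```

With the string 象形字 from main, strlen reports 9 bytes while utf8_count reports 3 characters, and utf8_prefix with a limit of 2 returns the six bytes of 象形. Cutting by byte offset instead could split a character in the middle, which is the reason the listing uses the character-aware GLib functions.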