Unicode matters because it is a standard that lets computers consistently represent and manipulate text from any existing writing system.
Here I present C source code that displays non-ASCII characters properly in the console. To display non-ASCII characters, the text has to pass through a Unicode decoding/encoding step. There are various ways to handle Unicode in C; here I have embedded Python to do it.
To embed Python in C, we need to include the Python.h header in the program. The python-dev package, which contains Python.h, must be installed before compiling this program. The program also uses GLib (glib.h) for its UTF-8 string helpers. More detail on embedding Python in C is described here.
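Source code

/* Python.h should come before the standard headers.
   On macOS framework builds the path may be <Python/Python.h>. */
#include <Python.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <glib.h>

/* Decode buffer from charset and return a newly allocated UTF-8 copy.
   The caller must free() the result. Returns NULL on failure. */
char *get_encoded_msg(const char *buffer, const char *charset)
{
    Py_ssize_t ssize = (Py_ssize_t)strlen(buffer);

    /* bytes in charset -> Python unicode object */
    PyObject *pyobject_unicode = PyUnicode_Decode(buffer, ssize, charset, "replace");
    if (pyobject_unicode == NULL) {
        printf("decode failed for: %s\n", buffer);
        return NULL;
    }

    /* unicode object -> UTF-8 byte string */
    PyObject *pystring = PyUnicode_AsUTF8String(pyobject_unicode);
    if (pystring == NULL) {
        printf("UTF-8 encode failed for: %s\n", buffer);
        Py_DECREF(pyobject_unicode);   /* do not leak on the error path */
        return NULL;
    }

    /* PyBytes_AsString is an alias of PyString_AsString on Python 2.6+ */
    const char *encoded_str = PyBytes_AsString(pystring);
    char *encoded_str_dup = strdup(encoded_str);   /* copy before releasing pystring */
    Py_DECREF(pystring);
    Py_DECREF(pyobject_unicode);
    if (encoded_str_dup == NULL)
        return NULL;
    printf("Encoded string: %s\n", encoded_str_dup);

    /* characters found in at most the first 9 bytes */
    int new_glength = (int)g_utf8_strlen(encoded_str_dup, 9);
    printf("new length = %d\n", new_glength);

    /* same call on plain ASCII, limited to the first 4 bytes */
    const char *test = "laxmi";
    int new_len = (int)g_utf8_strlen(test, 4);
    printf("new = %d\n", new_len);

    /* first three characters; g_utf8_substring allocates, so g_free it */
    char *required_message = g_utf8_substring(encoded_str_dup, 0, 3);
    printf("final value = %s\n", required_message);
    g_free(required_message);

    return encoded_str_dup;
}

int main(void)
{
    /* Initialize the Python interpreter */
    Py_Initialize();
    printf("here\n");

    /* alternative sample: "象形字 xiàngxíngzì" */
    const char *message = "象形字";
    int len = (int)strlen(message);   /* strlen counts bytes, not characters */
    printf("length of unicode string = %d\n", len);

    const char *charset = "UTF-8";
    char *encoded_msg = get_encoded_msg(message, charset);
    free(encoded_msg);

    Py_Finalize();
    return 0;
}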
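The program above prints both strlen(message) and g_utf8_strlen(...) because the two measure different things: bytes versus characters. As a rough illustration of that difference, here is a minimal sketch in plain C that does not need Python or GLib. The helper names utf8_char_count and utf8_prefix_bytes are hypothetical (they are not part of the original program or of any library), and the sketch assumes the input is already valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count UTF-8 characters (code points) in a
 * NUL-terminated string. Assumes valid UTF-8, where every continuation
 * byte has the bit pattern 10xxxxxx. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        /* count every byte that is NOT a continuation byte */
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Hypothetical helper: number of bytes occupied by the first n
 * characters of s, or by the whole string if it is shorter than n
 * characters. Useful for cutting a string on a character boundary. */
size_t utf8_prefix_bytes(const char *s, size_t n)
{
    size_t i = 0;
    while (s[i] != '\0') {
        if (((unsigned char)s[i] & 0xC0) != 0x80) {
            if (n == 0)
                break;      /* reached the start of character n+1 */
            n--;
        }
        i++;
    }
    return i;
}
```

For the sample string "象形字", strlen reports 9 because each of the three characters takes three bytes in UTF-8, while utf8_char_count reports 3. For the ASCII string "laxmi" both counts are 5, which is why byte-based code appears to work until non-ASCII text shows up.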