Tokens created by lexical analysis have two parts, one is the block of text corresponding to the token(Known as lexeme) and the other is the token type. So to give an example, a sequence like "a=1+2" after tokenisation will look like
2 numericconstantNow that we understans the basic concept of lexical anlysis, we might be tempted to write a Lexer :) . Writing lexers though possible might be a waste of time at times. There are programs known as "Lexer generators", yes u guessed it write, these are programs that generate lexer programs.
The lexer generators take a input file with details about the required characteristics of a lexer and generate the lexer. One lexer that seems to be popular is "flex". The input file for flex consists of 3 parts, the definiton part, the rules part and the user code section.
Each of the parts is separated by "%%". So a simple program in flex would be like
This just makes "sayhello" a token that has the action printf("Hello");. The flex has much much more that can be done. There are features to optimise the memory usage to suit the lexer u design.