This project extracts similar words between orthographically similar languages, along with the distance between them, from the provided corpora. Similarity is measured with the Longest Common Substring (LCS), computed using suffix trees, and with n-grams.
For more information, see the report and the presentation (PPT) included with the code.
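
As a rough illustration of the two measures named above, here is a minimal Python sketch. It is not the code in this repository. It finds the longest common substring with plain dynamic programming rather than a suffix tree, to keep the example short. The function names, the normalized distance formula, and the choice of Jaccard overlap on character bigrams are all assumptions made for illustration; the report describes the measures actually used.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest common substring of a and b.

    Uses O(len(a) * len(b)) dynamic programming, not a suffix tree.
    """
    best_len = 0
    best_end = 0  # exclusive end index of the best match in a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


def lcs_distance(a: str, b: str) -> float:
    """Hypothetical distance: 1 - len(LCS) / len(longer word)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return 1.0 - len(longest_common_substring(a, b)) / longest


def char_ngrams(word: str, n: int = 2) -> set:
    """Return the set of character n-grams of a word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Jaccard overlap of character n-gram sets (an assumed measure)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0 if a == b else 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


if __name__ == "__main__":
    pair = ("nacht", "night")
    print(longest_common_substring(*pair))
    print(lcs_distance(*pair))
    print(ngram_similarity(*pair))
```

A suffix tree (or generalized suffix automaton) reduces the substring search to linear time in the combined word length, which matters when comparing every word pair across two large corpora.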