Itihasa is a corpus of Sanskrit-English translation pairs extracted from Manmatha Nath Dutt's translations of the Ramayana and the Mahabharata. The original digitized volumes are available here. You may occasionally find syntactic errors in the shlokas or their translations; this is expected, because the text was extracted from the scanned documents with OCR. If you want to help correct these errors, contact me. You can find more details about the dataset and its curation process in this paper.
@inproceedings{aralikatte-etal-2021-itihasa,
    title = "Itihasa: A large-scale corpus for {S}anskrit to {E}nglish translation",
    author = "Aralikatte, Rahul and de Lhoneux, Miryam and Kunchukuttan, Anoop and S{\o}gaard, Anders",
    booktitle = "Proceedings of the 8th Workshop on Asian Translation (WAT2021)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.wat-1.22",
    pages = "191--197"
}