Jai Hind! Jai Gyan! India on the Internet Archive
In India, many speeches begin or end with the phrase “Jai Hind.” Jai means “long live” and “Hind” of course is the great Republic of India, the largest democracy in the world. Jawaharlal Nehru popularized this phrase, and it became the battle cry of the fight for liberation. “Gyan” means knowledge, and I have taken to ending my speeches in India with “Jai Hind! Jai Gyan!”
In this blog post, I’d like to tell you a little bit about my work in India, talk about some of the amazing resources having to do with India on the Internet Archive, and recognize some of my colleagues who have shown me so much kindness and given me so much inspiration.
India was not a complete stranger to me. It was one of the stops in my Internet travelogue Exploring the Internet, and I finished one of my first books on a houseboat in Srinagar. India also played an important role in the Internet World’s Fair, and I was honored that His Holiness the Dalai Lama wrote the foreword to my book about the fair and allowed me to present him a copy in Dharamshala.
For the last few years, however, I’ve been spending a great deal of my time in this amazing country. I wrote about my passage to India last year in the book Code Swaraj, which is of course available for free, no rights reserved, and has been translated into Hindi, Urdu, Bangla, Punjabi, Gujarati, Kannada, Marathi, Tamil, and Telugu. My fascination with and commitment to India come from many sources, but I was particularly inspired by the movement for liberation and the lessons we can learn from Mahatma Gandhi for how one can confront authority and change the world.
My work in India has been made possible by a generous grant from Arcadia, to whom I am immensely grateful for allowing me to pursue this work. Arcadia has been instrumental in promoting open access throughout the world, including support for the Internet Archive and many other groups.
The Internet Archive now hosts one of the very largest collections of materials by and about India. Let me tell you about a few of these:
- The Public Library of India is a collection of books mirrored from the Internet in over 100 languages. Many of those books we archived are no longer available in their original locations, so we are thrilled that the Internet Archive became a home for these valuable materials. Some of the scans are old, some of the metadata and quality control is not so good, but the materials are unique and our 425,121 texts have received over 62 million views.
- The metadata on the Public Library of India had been entered in Roman characters, which made the collection much less useful for works in non-Roman scripts. A team of Wikipedians led by my friend Arjuna Rao Chavala painstakingly reentered the titles and creators for 17,655 books in Telugu into the original script, making the books findable for those who read that language. That effort is now being replicated for other languages, such as Kannada and Tamil.
- One of my personal passions has been the Hind Swaraj collection, which is devoted to the fight for Indian independence. The collection features 595 works about Gandhi Ji including all 100 volumes of his Complete Works, as well as the complete works of Nehru, Ambedkar, and substantial collections of texts and audio by figures such as Rabindranath Tagore, Sarvepalli Radhakrishnan, Subhas Chandra Bose, and Sardar Patel.
- Public Resource has been honored to work closely with the Indian Academy of Sciences, with whom we have a formal Memorandum of Cooperation. As part of that effort, we have digitized all the Indian Academy’s books, and maintain an extensive collection of science resources of India. It has been a great pleasure to work with and become friends with my colleagues at the Academy, including the distinguished Professor Amitabh Joshi, who has spearheaded this effort, and Professor Partha Majumder, the President. We have in place similar Memoranda of Cooperation and have hosted collections for the JC Bose Trust and the National Centre for Biological Sciences.
One of the most gratifying things about working in India is the public spirit, technical skills, and enthusiasm of volunteers all over the country. We have banded together and call ourselves the Servants of Knowledge, a hat-tip to Gokhale’s Servants of India Society. The Indian Academy kindly allowed us to place a Table Top Scribe in their Bengaluru headquarters, a unit which was donated by the Kahle Austin Foundation. It has been a true delight to work with and learn about the Internet Archive digitization framework, which is extensive and incredibly powerful.
The Servants of Knowledge collection and the scanning effort in Bengaluru are managed by my friend Omshivaprakash, a long-time Wikipedian and a passionate advocate for the Kannada language and heritage. Likewise, Shiju Alex has been toiling for years to digitize key works in Malayalam. In Mangaluru, Prashanth Shenoy has led the effort for Konkani texts, and in Chennai the indomitable T. Shrinivasan has long worked to make more Tamil resources available on the Internet.
Since 2011, Public Resource has worked with Dr. Sushant Sinha, the founder of Indian Kanoon, the amazing free site that provides access to all case law and other legal materials in India. Indian Kanoon was recently honored with the prestigious Agami Prize for service to the citizens of India. For a year, Sushant and I have been working on a project that we believe is transformational: pulling in the Official Gazettes of India from the central government and 19 states, an archive that is updated daily and has over 455 documents. (The Official Gazettes are the newspapers of government, akin to the Federal Register in the United States.)
Particularly impressive has been Sushant’s effort to extend the Internet Archive by doing OCR in Indian languages. He has written code that pulls a document off the Internet Archive and bounces it off Google Vision for the OCR, then recreates the files that the Archive would expect to see if it had done the OCR in the ABBYY software it uses. The code is now working, and he’s been applying it to mixed-language Gazettes in Hindi and English and to the Karnataka Gazette in the Kannada language. The code is totally open source, we are beginning to apply it to books, and we are hoping to supplement Google Vision with Tesseract and other modules.
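For readers who like to see the shape of such a pipeline, here is a minimal sketch in Python. It is not Sushant’s code. It only illustrates the idea of locating a file in an Internet Archive item, passing each page through an ordered list of OCR engines, and falling back to the next engine when one fails or returns nothing. The engine signature, the `PageText` record, and the blank-line page separator are all assumptions made for this illustration; the only real-world detail used is the public `archive.org/download/<identifier>/<filename>` URL pattern.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple
from urllib.parse import quote

# Public download URL pattern for files inside an Internet Archive item.
ARCHIVE_DOWNLOAD = "https://archive.org/download/{identifier}/{filename}"

# Hypothetical engine signature: (page image bytes, language hint) -> text.
# A real engine would wrap Google Vision, Tesseract, or another module.
OcrEngine = Callable[[bytes, str], str]


@dataclass
class PageText:
    page_number: int   # 1-based page number
    engine_index: int  # position of the engine that produced the text, -1 if none
    text: str


def download_url(identifier: str, filename: str) -> str:
    """Build the URL of one file inside an Internet Archive item."""
    return ARCHIVE_DOWNLOAD.format(
        identifier=quote(identifier), filename=quote(filename)
    )


def ocr_page(
    image: bytes, language: str, engines: Sequence[OcrEngine]
) -> Tuple[int, str]:
    """Try each engine in order; return the first non-empty result."""
    for index, engine in enumerate(engines):
        try:
            text = engine(image, language).strip()
        except Exception:
            continue  # this engine failed, so fall through to the next one
        if text:
            return index, text
    return -1, ""


def ocr_document(
    pages: Iterable[bytes], language: str, engines: Sequence[OcrEngine]
) -> List[PageText]:
    """Run OCR over every page image of a document."""
    results = []
    for number, image in enumerate(pages, start=1):
        index, text = ocr_page(image, language, engines)
        results.append(PageText(number, index, text))
    return results


def to_plain_text(pages: Iterable[PageText]) -> str:
    """Join page texts with blank lines (separator chosen for this sketch)."""
    return "\n\n".join(page.text for page in pages)
```

In practice the first engine in the list might wrap a Google Vision client and the second a Tesseract call; their setup is left out here because it depends on credentials and installed software. Writing the result back in the file formats the Archive expects is also outside the scope of this sketch.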
Public Resource has two other major efforts in India, both of which we believe have the potential to be transformational not only in India but in the rest of the world.
- In Delhi, we have a formal memorandum of research cooperation with Dr. Andrew Lynn of Jawaharlal Nehru University, where we have created the JNU Data Depot, an effort to advance text and data mining on the scientific corpus by researchers. The system is closely modeled on the HathiTrust effort in the U.S. and gives non-commercial university researchers carefully secured access to the corpus for non-consumptive text and data mining. This project was recently featured in Nature, the international journal of science. In addition to the JNU facility, Public Resource has installed a mirror at IIT Delhi, under the direction of Dr. Sanjiva Prasad. We have a distinguished board of advisors from universities throughout India and have received legal advice and counsel from some of the most distinguished intellectual property experts in the country, including Professor Arul George Scaria, Professor N. S. Gopalakrishnan, Professor Feroz Ali, and Professor Lawrence Liang. I have also been grateful for the personal insights and friendship provided to me by Dr. Zakir Thomas, a senior civil servant and the former Registrar of Copyrights for India.
- One of our initial programs in India has been to make available all Indian Standards, the public safety codes of India. We have made 18,471 such standards available in our Public Safety Codes Collection and the documents have been invaluable for millions of Indian students, government officials, and others who need to consult these valuable government-issued rules and regulations. We have filed a public interest litigation writ petition before the Hon’ble High Court of Delhi after the government objected to our efforts. My co-petitioners are Dr. Sinha and my friend Srinivas Kodali. We are represented before the Hon’ble High Court by senior advocates Jawahar Raja and Salman Khurshid and the law firm of Nishith Desai and Associates.
You can read more about my efforts in India on the Public Resource Docket where I keep a listing of speeches, press, and other public information.
I close on a sad note. India lost a remarkable person this year and I lost a dear friend. Shamnad Basheer passed away at the young age of 43 after a very long illness. In his short time on earth he touched so many lives, mine included.
Shamnad had the finest legal mind of anybody I have ever met, and I have had the privilege of working with many of the best. Shamnad, though, was on a wholly different level. He was considered to be the leading intellectual property expert in India, but he also pursued justice in many other areas of the law. He filed a petition that challenged discrimination in law school admissions, he intervened in the landmark Novartis case, and he played an instrumental role in the Delhi University Photocopy case.
But he did so much more. His greatest accomplishment is IDIA, which he created and to which he devoted his greatest efforts. IDIA’s mission is increasing diversity by increasing access to legal education. They find young students with great potential but living in impoverished circumstances, get them ready to take the exam to get into law school, then stick with them to get them through the program. These young lawyers then go back to their communities to provide justice. It is an immensely inspirational program and you can do nothing better to honor Shamnad’s memory than donate to IDIA.
I read a lot about Gandhi, I speak about him frequently, and I have learned much from his work. Gandhi is an important part of my life, but with Shamnad it was different. Shamnad, more than anybody I ever met, lived his life like Gandhi. He was a public worker, devoted to justice and equality. He knew a vast number of people, and he deeply touched the life of every person he met.
He had been dreadfully sick for so many years, but he did not let that stop him. I admonished him once when he invited me to speak at the IDIA annual event and he had just flown in the night before from Iran, where he was advising people on intellectual property. You should take it easy, I told him. He replied that he wasn’t going to let his illness rule his life, and he never stopped his public work for one minute.
The night before he passed away, we were exchanging messages on WhatsApp. He had been especially sick of late and had gone on a pilgrimage to Baba Budangiri. His last words to me, which I wish to share, were:
I am in a very special place right now. Even as my body is battered my spirit is strengthening. In Baba Budangiri a site of amazing syncretic spirituality. Where I have found much peace and meaning. And a place that has helped me transcend the body. And in this special place, I am offering prayers for you. And sending you lots of good energy. To continue the good fight. Lots of love. Shams
Wherever Shamnad is now, his star will shine bright for the ages. He will forever be missed and forever remembered. May the gods bless you, Shamnad. Thank you for all you did for me and for so many others.