Large Language Models

Last updated 1 year ago

  • Understanding Large Language Models -- A Transformative Reading List - Sebastian's whole site is well worth reading; start with this survey of LLM posts and literature

  • A Primer on Neural Network Models for Natural Language Processing - It's a good idea to read everything Yoav has written, but this is a great start

  • Figures Everyone Should Know https://github.com/ray-project/llm-numbers

  • Transformers from Scratch - This is the one I come back to every time. https://e2eml.school/transformers.html

  • Illustrated Word2Vec - Jay's site is extremely good; this post is particularly good on Word2Vec https://jalammar.github.io/illustrated-word2vec/

  • Attention? Attention! - Deep dive into the attention mechanism. https://lilianweng.github.io/posts/2018-06-24-attention/

  • A History of NLP - Great summary of the field over the last 20 or so years.

  • Dive into Deep Learning Course https://d2l.ai/index.html

  • https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/

  • Indic-gemma-7b-Navarasa

  • https://ig.ft.com/generative-ai/

RLHF

https://towardsdatascience.com/rlhf-reinforcement-learning-from-human-feedback-faa5ff4761d1

"If we aim to match the performance of ChatGPT through open source, I believe we need to start taking training data more seriously. A substantial part of ChatGPT’s effectiveness might not come from, say, specific ML architecture, fine-tuning techniques, or frameworks. But more likely, it’s from the breadth, scale and quality of the instruction data.

To put it bluntly, fine-tuning large language models on mediocre instruction data is a waste of compute. Let’s take a look at what has changed in the training data and learning paradigm—how we are now formatting the training data differently and therefore learning differently than in past large-scale pre-training."

Local language - LLMs

  • Kannada LLAMA https://www.tensoic.com/blog/kannada-llama/

  • Malaysian Mistral https://github.com/mesolitica/research-paper/blob/master/malaysian-mistral.pdf

  • MaLLaM Malaysia Large Language Model https://github.com/mesolitica/research-paper/blob/master/mallam.pdf https://huggingface.co/mesolitica/mallam-1.1B-4096

  • Tamil LLAMA https://arxiv.org/abs/2311.05845 and later https://abhinand05.medium.com/breaking-language-barriers-introducing-tamil-llama-v0-2-and-its-expansion-to-telugu-and-malayalam-deb5d23e9264

  • Introducing Airavata: Hindi Instruction-tuned LLM https://ai4bharat.github.io/airavata/

  • Malayalam LLM https://github.com/VishnuPJ/MalayaLLM

  • AYA https://huggingface.co/CohereForAI/aya-101

Copyright

  • OpenAI says it’s “impossible” to create useful AI models without copyrighted material https://arstechnica.com/information-technology/2024/01/openai-says-its-impossible-to-create-useful-ai-models-without-copyrighted-material/ - Further, OpenAI writes that limiting training data to public domain books and drawings "created more than a century ago" would not provide AI systems that "meet the needs of today's citizens."

  • https://www.aisnakeoil.com/p/generative-ais-end-run-around-copyright - We don’t think the injustice at the heart of generative AI will be redressed by the courts. Maybe changes to copyright law are necessary. Or maybe it will take other kinds of policy interventions that are outside the scope of copyright law. Either way, policymakers can’t take the easy way out.

Courses

  • https://github.com/mlabonne/llm-course
