#atom

Neural network-based systems trained to understand and generate human language

Core Idea: Large Language Models (LLMs) are neural networks, typically transformer architectures, trained on vast text corpora. They understand, generate, and manipulate human language with remarkable fluency and contextual awareness, and can perform a wide range of language-based tasks without task-specific training (few-shot or even zero-shot prompting).
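
To make the architecture behind this concrete, below is a minimal NumPy sketch of the scaled dot-product attention mechanism from Reference 1 (Attention Is All You Need). It is an illustration under simplifying assumptions, not a production implementation; the function name and the toy shapes are chosen here for clarity.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017)."""
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled to keep the softmax well-behaved.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax: each query gets a probability distribution over keys.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output vector is a weighted mixture of the value vectors.
    return weights @ V

# Toy example: a sequence of 3 tokens with model width 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```

In a full transformer this operation runs in parallel across many heads and is stacked across many layers with learned projections and feed-forward blocks; that stacking is what gives LLMs their contextual awareness.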

Key Elements

Technical Architecture

Capabilities and Limitations

Evolution and Progress

Applications

Connections

References

  1. Attention Is All You Need (Vaswani et al., 2017)
  2. Language Models are Few-Shot Learners (Brown et al., 2020)
  3. Training Language Models to Follow Instructions with Human Feedback (Ouyang et al., 2022)
  4. Scaling Laws for Neural Language Models (Kaplan et al., 2020)

#llm #nlp #deep-learning #transformer #language-models #foundation-models #ai

