Friday, July 12, 2024

    Here’s what you need to know about Apple’s family of open-source large language models, OpenELM

    Apple has released OpenELM (Open-source Efficient Language Models), a family of eight open-source large language models (LLMs), on Hugging Face. The OpenELM models range from 270 million to 3 billion parameters (the weights and biases that a model learns during training). According to the company’s research paper on OpenELM, it outperforms the Allen Institute for AI’s OLMo model by 2.36% in accuracy while requiring half as many pre-training tokens.

    OpenELM consists of four pre-trained models and four instruct models. Pre-trained models are artificial intelligence (AI) models trained on large data sets to learn general language patterns, and they can be further fine-tuned for specific tasks. Instruct models, on the other hand, are models that have been fine-tuned to follow prompted instructions.
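    The eight variants follow a regular layout on Hugging Face: four sizes, each in a base and an instruct flavour. The sketch below generates the repository IDs under the naming pattern used in Apple's Hugging Face organisation (e.g. `apple/OpenELM-270M`); the exact IDs should be verified on the Hub before use.

    ```python
    # Sketch of the OpenELM family layout on Hugging Face: four parameter
    # sizes, each available as a pre-trained (base) and an instruct model.
    # Repo IDs assume the "apple/OpenELM-<size>[-Instruct]" naming pattern.
    SIZES = ["270M", "450M", "1_1B", "3B"]

    def openelm_repo_ids():
        """Return the eight OpenELM repo IDs: four pre-trained, four instruct."""
        pretrained = [f"apple/OpenELM-{size}" for size in SIZES]
        instruct = [f"apple/OpenELM-{size}-Instruct" for size in SIZES]
        return pretrained + instruct

    for repo_id in openelm_repo_ids():
        print(repo_id)
    ```

    Any of these IDs can then be passed to a Hugging Face loader such as `AutoModelForCausalLM.from_pretrained(...)` to download the weights.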

    Data used for training OpenELM models:

    The models have been trained on publicly available data sets including Wikipedia, Wikibooks, Reddit, GitHub, arXiv (the open-access archive for scholarly articles), and Project Gutenberg. Given the use of publicly available data sets, Apple mentions that the models do not carry any safety guarantees: there is a possibility that they could produce inaccurate, harmful, biased, or objectionable outputs in response to user prompts. “Thus, it is imperative for users and developers to undertake thorough safety testing and implement appropriate filtering mechanisms tailored to their specific requirements,” the company mentions.
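    To make the filtering recommendation concrete, here is a toy illustration (not Apple's method, and far simpler than a production safety layer) of a blocklist check a deployer might run on model output before showing it to a user; the blocklist terms are placeholders.

    ```python
    # Toy output filter of the kind the OpenELM paper asks deployers to add.
    # A real system would use classifiers, not a static blocklist; the terms
    # here are hypothetical placeholders.
    BLOCKLIST = {"slur_example", "harmful_instruction_example"}

    def passes_filter(model_output: str) -> bool:
        """Return False if the model output contains any blocklisted term."""
        lowered = model_output.lower()
        return not any(term in lowered for term in BLOCKLIST)

    print(passes_filter("The capital of France is Paris."))  # True
    ```

    In practice such a check would sit between the model's generate step and the application's response, with the filtering criteria tailored to the deployment, as the paper advises.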

    How ‘open’ is Apple’s OpenELM?

    “Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations,” Apple’s research paper on OpenELM says.

    The models are available under the Apple Sample Code License, which allows developers and users to use, reproduce, modify, and redistribute them with or without changes. However, if the models are redistributed in their entirety without modifications, every redistribution must retain the license and certain disclaimers.

    This makes Apple’s OpenELM different from Meta’s open-source releases, which have been questioned for having restrictions baked into the license. Even Meta’s latest open-source release, Llama 3, puts certain restrictions on commercial usage: its license states that app developers whose products or services have more than 700 million monthly active users must request a special license from Meta.

    Apple’s journey in the AI race:

    Apple has notably not released any commercial models, unlike competitors such as Google and Microsoft, which offer models like Gemini and Phi-3 respectively. The company has, however, been expanding its roster of AI businesses, acquiring 32 AI firms in 2023. Most recently, it acquired DarwinAI, a Canadian AI startup that creates AI technologies for visually inspecting components during manufacturing and works on ways to make AI systems smaller and faster.

    Besides these acquisitions, Apple has also reportedly been in talks with Google to incorporate Gemini into iPhones. If this collaboration were to go through, Apple would license Gemini to power some of the new features coming to the iPhone software this year.



    The post Here’s what you need to know about Apple’s family of open source large language models OpenELM appeared first on MediaNama.
