Cohere AI

A Ruby gem for interacting with Cohere AI.

This Gem is designed to provide low-level access to Cohere AI, enabling people to build abstractions on top of it. If you are interested in more high-level abstractions or more user-friendly tools, you may want to consider Nano Bots 💎 🤖.

TL;DR and Quick Start

gem 'cohere-ai', '~> 1.1.0'

require 'cohere-ai'

client = Cohere.new(
  credentials: { api_key: ENV['COHERE_API_KEY'] },
  options: { server_sent_events: true }
)

result = client.chat(
  { model: 'command', message: 'Hi!' }
)

Result:

{ 'response_id' => '59aac3ec-74c8-4b65-a80c-d4c1916d2511',
  'text' =>
  "Hi there! I'm Coral, an AI-assistant chatbot ready to help you with whatever you need. Is there anything I can assist you with today? \n" \
    "\n" \
    "If you'd like, I can also provide you with a list of some common topics that I can help with if you're looking for ideas. \n" \
    "\n" \
    "Just let me know if there's anything I can do to help make your day easier!",
  'generation_id' => '8a72b85c-ed29-4258-bd52-82b81be2353b',
  'token_count' => { 'prompt_tokens' => 64, 'response_tokens' => 82, 'total_tokens' => 146, 'billed_tokens' => 135 },
  'meta' => { 'api_version' => { 'version' => '1' },
              'billed_units' => { 'input_tokens' => 53, 'output_tokens' => 82 } },
  'tool_inputs' => nil }

Index

TL;DR and Quick Start
Index
Setup
- Installing
- Credentials
Usage
- Client
  - Custom Address
- Methods
  - chat
    - Without Streaming Events
    - Receiving Stream Events
  - generate
  - embed
  - rerank
  - classify
  - detect_language
  - summarize
  - tokenize
  - detokenize
- Datasets
- Connectors
- Streaming and Server-Sent Events (SSE)
  - Server-Sent Events (SSE) Hang
- Back-and-Forth Conversations
- New Functionalities and APIs
- Request Options
  - Adapter
  - Timeout
- Error Handling
  - Rescuing
  - For Short
  - Errors
Development
- Purpose
- Publish to RubyGems
- Updating the README
Resources and References
Disclaimer

Setup

Installing

gem install cohere-ai -v 1.1.0

gem 'cohere-ai', '~> 1.1.0'

Credentials

You can obtain your API key from the Cohere AI Platform.

Usage

Client

Ensure that you have an API Key for authentication.

Create a new client:

require 'cohere-ai'

client = Cohere.new(
  credentials: { api_key: ENV['COHERE_API_KEY'] },
  options: { server_sent_events: true }
)

Custom Address

You can use a custom address:

require 'cohere-ai'

client = Cohere.new(
  credentials: {
    address: 'https://api.cohere.ai',
    api_key: ENV['COHERE_API_KEY']
  },
  options: { server_sent_events: true }
)

Methods

chat

Documentation: https://docs.cohere.com/reference/chat

Without Streaming Events

result = client.chat(
  { model: 'command', message: 'Hi!' }
)

Result:

{ 'response_id' => '59aac3ec-74c8-4b65-a80c-d4c1916d2511',
  'text' =>
  "Hi there! I'm Coral, an AI-assistant chatbot ready to help you with whatever you need. Is there anything I can assist you with today? \n" \
    "\n" \
    "If you'd like, I can also provide you with a list of some common topics that I can help with if you're looking for ideas. \n" \
    "\n" \
    "Just let me know if there's anything I can do to help make your day easier!",
  'generation_id' => '8a72b85c-ed29-4258-bd52-82b81be2353b',
  'token_count' => { 'prompt_tokens' => 64, 'response_tokens' => 82, 'total_tokens' => 146, 'billed_tokens' => 135 },
  'meta' => { 'api_version' => { 'version' => '1' },
              'billed_units' => { 'input_tokens' => 53, 'output_tokens' => 82 } },
  'tool_inputs' => nil }
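The result is a plain Ruby Hash with string keys, so individual fields can be read with `[]` and `dig`. The following is a minimal sketch that assumes the response shape shown above; the sample hash is shortened for illustration and is not produced by a live request.

```ruby
# Sample result, shortened from the response shown above.
result = {
  'text' => 'Hi there!',
  'token_count' => { 'prompt_tokens' => 64, 'response_tokens' => 82,
                     'total_tokens' => 146, 'billed_tokens' => 135 },
  'meta' => { 'api_version' => { 'version' => '1' },
              'billed_units' => { 'input_tokens' => 53, 'output_tokens' => 82 } }
}

# Keys are strings, not symbols.
text = result['text']

# dig walks nested hashes and returns nil if any level is missing.
billed_units = result.dig('meta', 'billed_units')
billed_total = billed_units.values.sum

puts text
puts billed_total
```

In this sample, the sum of the billed units matches `billed_tokens` in `token_count`.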

Receiving Stream Events

Ensure that you have enabled Server-Sent Events before using blocks for streaming. You also need to add stream: true in your payload:

client.chat(
  { model: 'command', stream: true, message: 'Hi!' }
) do |event, raw|
  puts event
end

Event:

{ 'is_finished' => false,
  'event_type' => 'stream-start',
  'generation_id' => '0672df6f-736e-4536-8ad1-36bc808a114d' }

{ 'is_finished' => false,
  'event_type' => 'text-generation',
  'text' => 'Hi' }

{ 'is_finished' => true,
  'event_type' => 'stream-end',
  'response' => {
    'response_id' => '7dca6bf9-8a21-4f81-9e4e-2c5b8eb8c767',
    'text' => "Hi there! How can I help you today? I'm glad to assist you with any questions or tasks you have in mind, and I'm always happy to have a friendly conversation too. Go ahead and let me know how I can make your day better!",
    'generation_id' => '0672df6f-736e-4536-8ad1-36bc808a114d',
    'token_count' => {
      'prompt_tokens' => 64, 'response_tokens' => 52, 'total_tokens' => 116, 'billed_tokens' => 105
    },
    'tool_inputs' => nil
  },
  'finish_reason' => 'COMPLETE' }

You can get all the received events at once as an array:

result = client.chat(
  { model: 'command', stream: true, message: 'Hi!' }
)

Result:

[{ 'is_finished' => false, 'event_type' => 'stream-start', 'generation_id' => '7f0349ef-a823-4779-801a-6e0a56c016a9' },
 { 'is_finished' => false, 'event_type' => 'text-generation', 'text' => 'Hi' },
 # ...
 { 'is_finished' => false, 'event_type' => 'text-generation', 'text' => '?' },
 { 'is_finished' => true,
   'event_type' => 'stream-end',
   'response' =>
   { 'response_id' => 'f4135de4-3631-4f7c-98be-deb50b7f986b',
     'text' =>
     "Hi there! How can I assist you today? I'm programmed to provide helpful, fact-based responses and engage in conversations on a wide range of topics. Feel free to ask me anything, and we'll see how I can help you out! \n" \
       "\n" \
       'Do you have any requests or questions that I can help with?',
     'generation_id' => '7f0349ef-a823-4779-801a-6e0a56c016a9',
     'token_count' => { 'prompt_tokens' => 64, 'response_tokens' => 65, 'total_tokens' => 129, 'billed_tokens' => 118 },
     'tool_inputs' => nil },
   'finish_reason' => 'COMPLETE' }]
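When the events are collected as an array, the generated text is spread across the `text-generation` events. The sketch below shows one way to rebuild it, assuming events shaped like the ones above. `streamed_text` is a hypothetical helper, not part of the gem, and the sample events are shortened for illustration.

```ruby
# Hypothetical helper (not part of the gem): rebuilds the generated text
# from an array of stream events shaped like the ones shown above.
def streamed_text(events)
  events
    .select { |event| event['event_type'] == 'text-generation' }
    .map { |event| event['text'] }
    .join
end

# Illustrative events; a real stream carries many more text-generation events.
events = [
  { 'is_finished' => false, 'event_type' => 'stream-start' },
  { 'is_finished' => false, 'event_type' => 'text-generation', 'text' => 'Hi' },
  { 'is_finished' => false, 'event_type' => 'text-generation', 'text' => ' there' },
  { 'is_finished' => true, 'event_type' => 'stream-end', 'finish_reason' => 'COMPLETE' }
]

puts streamed_text(events)
```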

You can mix both as well:

result = client.chat(
  { model: 'command', stream: true, message: 'Hi!' }
) do |event, raw|
  puts event
end

generate

Documentation: https://docs.cohere.com/reference/generate

client.generate(
  { stream: true,
    truncate: 'END',
    return_likelihoods: 'NONE',
    prompt: 'Please explain to me how LLMs work' }
) do |event, raw|
  puts event
end

result = client.generate(
  { truncate: 'END',
    return_likelihoods: 'NONE',
    prompt: 'Please explain to me how LLMs work' }
)

Result:

{ 'id' => '9caef8a6-07fb-465c-a003-5fd7c7bbccbd',
  'generations' =>
  [{ 'id' => '11b043ab-6d9c-415f-90b0-6a802a3807f7',
     'text' =>
     " LLMs, or Large Language Models, are a type of neural network architecture designed to understand and generate human-like language. They are trained by feeding them massive amounts of text data and adjusting their internal parameters to predict the next word in a sequence accurately, given the words that came before it. This process is known as language modeling.\n" \
       "\n" \
       "There are two main types of LLMs: autoregressive and generative models. Autoregressive models, such as the long short-term memory (LSTM) network and the gated recurrent unit (GRU), predict the next word in a sequence by relying solely on the past context. On the other hand, generative models, such as the variational autoencoder (VAE) and the generative adversarial network (GAN), can generate new data similar to the training data.\n" \
       "\n" \
       "One of the key features of LLMs is their ability to capture long-range dependencies in text, which means they can understand and generate sentences that rely on information from words far away in the sequence. This is particularly useful for tasks like language translation, summarization, and question-answering, where understanding the context is crucial.\n" \
       "\n" \
       "To improve the accuracy and performance of LLMs, researchers and engineers often use techniques like pre-training and transfer learning. During pre-training, the LLM is trained on a large and diverse dataset, such as books, articles, and websites, to learn general language patterns and relationships. Once the LLM is pre-trained, it can be fine-tuned on specific tasks with smaller datasets related to the task, such as medical texts or legal documents. This way, the LLM can leverage its knowledge of language understanding while adapting to the specific domain or task at hand.\n" \
       "\n" \
       "Overall, LLMs have had a significant impact on the field of natural language processing, allowing researchers to build more sophisticated models that can understand and generate language in ways that were previously impossible. However, it is important to note that LLMs are sophisticated tools that require large amounts of data, computational power, and careful tuning to achieve their full potential. Moreover, they can also inherit the biases present in the training data, highlighting the importance of data diversity and responsible LLM development and usage. \n" \
       "\n" \
       'Would you like me to go into more detail about any specific aspect of LLM functionality? ',
     'finish_reason' => 'COMPLETE' }],
  'prompt' => 'Please explain to me how LLMs work',
  'meta' => { 'api_version' => { 'version' => '1' },
              'billed_units' => { 'input_tokens' => 8, 'output_tokens' => 476 } } }

embed

Documentation: https://docs.cohere.com/reference/embed

result = client.embed(
  { texts: ['hello', 'goodbye'],
    model: 'embed-english-v3.0',
    input_type: 'classification' }
)

Result:

{ 'id' => '5d77f09a-3518-44d3-bac3-f868b3e036bc',
  'texts' => ['hello', 'goodbye'],
  'embeddings' =>
  [[0.016296387,
    -0.008354187,
    # ...
    -0.01802063,
    0.009765625],
   [0.04663086,
    -0.023239136,
    # ...
    0.0023212433,
    0.0052719116]],
  'meta' =>
  { 'api_version' => { 'version' => '1' }, 'billed_units' => { 'input_tokens' => 2 } },
  'response_type' => 'embeddings_floats' }
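Embeddings are usually compared with a similarity measure. The following is a minimal sketch of cosine similarity over two vectors; `cosine_similarity` is a hypothetical helper, not part of the gem, and the short vectors are illustrative stand-ins for the much longer arrays found in `result['embeddings']`.

```ruby
# Hypothetical helper (not part of the gem): cosine similarity between
# two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end

# Illustrative vectors only; real embeddings have many more dimensions.
hello = [0.016296387, -0.008354187, -0.01802063, 0.009765625]
goodbye = [0.04663086, -0.023239136, 0.0023212433, 0.0052719116]

puts cosine_similarity(hello, goodbye)
```

With a real response, the two inputs could be taken from `result['embeddings'][0]` and `result['embeddings'][1]`.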

rerank

Documentation: https://docs.cohere.com/reference/rerank

result = client.rerank(
  { return_documents: false,
    max_chunks_per_doc: 10,
    model: 'rerank-english-v2.0',
    query: 'What is the capital of the United States?',
    documents: [
      'Carson City is the capital city of the American state of Nevada.',
      'The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.',
      'Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.',
      'Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states.'
    ] }
)

Result:

{ 'id' => 'ba2100fe-69d2-41d6-8680-810948af872d',
  'results' =>
  [{ 'index' => 2, 'relevance_score' => 0.98005307 },
   { 'index' => 3, 'relevance_score' => 0.27904198 },
   { 'index' => 0, 'relevance_score' => 0.10194652 },
   { 'index' => 1, 'relevance_score' => 0.0721122 }],
  'meta' =>
  { 'api_version' => { 'version' => '1' }, 'billed_units' => { 'search_units' => 1 } } }
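Because the request above sets `return_documents: false`, each result carries only an `index` into the `documents` array that was sent. A small sketch of mapping the ranking back to the original texts, assuming the result shape shown above (the document strings are shortened here for illustration):

```ruby
# Shortened versions of the documents sent in the request above.
documents = [
  'Carson City is the capital city of the American state of Nevada.',
  'The Commonwealth of the Northern Mariana Islands is a group of islands.',
  'Washington, D.C. is the capital of the United States.',
  'Capital punishment has existed in the United States for a long time.'
]

# Result shape copied from the response above.
result = {
  'results' => [
    { 'index' => 2, 'relevance_score' => 0.98005307 },
    { 'index' => 3, 'relevance_score' => 0.27904198 },
    { 'index' => 0, 'relevance_score' => 0.10194652 },
    { 'index' => 1, 'relevance_score' => 0.0721122 }
  ]
}

# Pair each score with the document it refers to, keeping the ranked order.
ranked = result['results'].map do |entry|
  { document: documents[entry['index']], score: entry['relevance_score'] }
end

ranked.each { |row| puts format('%.4f  %s', row[:score], row[:document]) }
```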

classify

Documentation: https://docs.cohere.com/reference/classify

result = client.classify(
  {
    truncate: 'END',
    inputs: [
      'Confirm your email address',
      'hey i need u to send some $'
    ],
    examples: [
      {
        text: "Dermatologists don't like her!",
        label: 'Spam'
      },
      {
        text: 'Hello, open to this?',
        label: 'Spam'
      },
      {
        text: 'I need help please wire me $1000 right now',
        label: 'Spam'
      },
      {
        text: 'Nice to know you ;)',
        label: 'Spam'
      },
      {
        text: 'Please help me?',
        label: 'Spam'
      },
      {
        text: 'Your parcel will be delivered today',
        label: 'Not spam'
      },
      {
        text: 'Review changes to our Terms and Conditions',
        label: 'Not spam'
      },
      {
        text: 'Weekly sync notes',
        label: 'Not spam'
      },
      {
        text: 'Re: Follow up from today’s meeting',
        label: 'Not spam'
      },
      {
        text: 'Pre-read for tomorrow',
        label: 'Not spam'
      }
    ]
  }
)

Result:

{ 'id' => '1d0c322e-9ab7-4a80-b601-720e7592dde5',
  'classifications' =>
  [{ 'classification_type' => 'single-label',
     'confidence' => 0.8082329,
     'confidences' => [0.8082329],
     'id' => '44b2e358-83f1-4b9b-a6f1-e8d9c823f0ba',
     'input' => 'Confirm your email address',
     'labels' =>
     { 'Not spam' => { 'confidence' => 0.8082329 },
       'Spam' => { 'confidence' => 0.19176713 } },
     'prediction' => 'Not spam',
     'predictions' => ['Not spam'] },
   { 'classification_type' => 'single-label',
     'confidence' => 0.9893421,
     'confidences' => [0.9893421],
     'id' => '53ad6d6e-872f-4185-b3d6-6d03fe55f4c4',
     'input' => 'hey i need u to send some $',
     'labels' =>
     { 'Not spam' => { 'confidence' => 0.01065793 },
       'Spam' => { 'confidence' => 0.9893421 } },
     'prediction' => 'Spam',
     'predictions' => ['Spam'] }],
  'meta' =>
  { 'api_version' => { 'version' => '1' }, 'billed_units' => { 'classifications' => 2 } } }
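To work with the classifications, it can help to reduce the response to a lookup from input text to predicted label. This is a sketch under the assumption that the response has the shape shown above; `predictions` is a hypothetical helper, not part of the gem, and the sample hash keeps only the fields it reads.

```ruby
# Hypothetical helper (not part of the gem): maps each input text to its
# predicted label and confidence.
def predictions(result)
  result['classifications'].to_h do |classification|
    [classification['input'],
     { label: classification['prediction'], confidence: classification['confidence'] }]
  end
end

# Sample trimmed from the response shown above.
result = {
  'classifications' => [
    { 'input' => 'Confirm your email address',
      'prediction' => 'Not spam', 'confidence' => 0.8082329 },
    { 'input' => 'hey i need u to send some $',
      'prediction' => 'Spam', 'confidence' => 0.9893421 }
  ]
}

predictions(result).each do |input, prediction|
  puts "#{input.inspect} => #{prediction[:label]} (#{prediction[:confidence]})"
end
```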

detect_language

Documentation: https://docs.cohere.com/reference/detect-language

result = client.detect_language(
  { texts: ['Hello world', "'Здравствуй, Мир'"] }
)

Result:

{ 'id' => '82b111d4-ba6d-4f60-9536-883b42f3d86d',
  'results' =>
  [{ 'language_code' => 'en', 'language_name' => 'English' },
   { 'language_code' => 'ru', 'language_name' => 'Russian' }],
  'meta' => { 'api_version' => { 'version' => '1' } } }

summarize

Documentation: https://docs.cohere.com/reference/summarize

result = client.summarize(
  { text: "Ice cream is a sweetened frozen food typically eaten as a snack or dessert. It may be made from milk or cream and is flavoured with a sweetener, either sugar or an alternative, and a spice, such as cocoa or vanilla, or with fruit such as strawberries or peaches. It can also be made by whisking a flavored cream base and liquid nitrogen together. Food coloring is sometimes added, in addition to stabilizers. The mixture is cooled below the freezing point of water and stirred to incorporate air spaces and to prevent detectable ice crystals from forming. The result is a smooth, semi-solid foam that is solid at very low temperatures (below 2 °C or 35 °F). It becomes more malleable as its temperature increases.\n\nThe meaning of the name \"ice cream\" varies from one country to another. In some countries, such as the United States, \"ice cream\" applies only to a specific variety, and most governments regulate the commercial use of the various terms according to the relative quantities of the main ingredients, notably the amount of cream. Products that do not meet the criteria to be called ice cream are sometimes labelled \"frozen dairy dessert\" instead. In other countries, such as Italy and Argentina, one word is used for all variants. Analogues made from dairy alternatives, such as goat's or sheep's milk, or milk substitutes (e.g., soy, cashew, coconut, almond milk or tofu), are available for those who are lactose intolerant, allergic to dairy protein or vegan." }
)

Result:

{ 'id' => '5189bed8-2d12-47d8-8ae7-cf60788cf507',
  'summary' =>
  'Ice cream is a popular frozen dessert made from milk or cream, sweeteners, and spices like vanilla or cocoa, or fruit like strawberries. It is cooled below the freezing point of water and stirred to incorporate air spaces, resulting in a smooth semi-solid foam. While the name applies to just one variety in some countries, like the US, it refers to all variants in others, like Italy. Additionally, there are dairy alternatives available for those who are lactose intolerant or vegan. Ice cream is regulated by governments based on the quantities of its main ingredients.',
  'meta' =>
  { 'api_version' => { 'version' => '1' },
    'billed_units' => { 'input_tokens' => 321, 'output_tokens' => 111 } } }

tokenize

Documentation: https://docs.cohere.com/reference/tokenize

result = client.tokenize(
  { text: 'tokenize me! :D', model: 'command' }
)

Result:

{ 'tokens' => [10_002, 2261, 2012, 8, 2792, 43],
  'token_strings' => ['token', 'ize', ' me', '!', ' :', 'D'],
  'meta' => { 'api_version' => { 'version' => '1' } } }
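The `tokens` and `token_strings` arrays are aligned by position, so they can be paired up to inspect how the text was split. A short sketch using the values from the response above:

```ruby
# Values copied from the response shown above.
result = {
  'tokens' => [10_002, 2261, 2012, 8, 2792, 43],
  'token_strings' => ['token', 'ize', ' me', '!', ' :', 'D']
}

# Pair each token id with the string fragment it represents.
pairs = result['tokens'].zip(result['token_strings'])
pairs.each { |id, fragment| puts "#{id}\t#{fragment.inspect}" }

# Joining the fragments gives back the original text.
rebuilt = result['token_strings'].join
puts rebuilt
```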

detokenize

Documentation: https://docs.cohere.com/reference/detokenize

result = client.detokenize(
  { tokens: [10_104, 12_221, 1315, 34, 1420, 69], model: 'command' }
)

Result:

{ 'text' => ' Anton Mun🟣;🥭^', 'meta' => { 'api_version' => { 'version' => '1' } } }

Datasets

Documentation: https://docs.cohere.com/reference/create-dataset

result = client.request(
  'v1/dataset',
  request_method: 'GET', server_sent_events: false
)

result = client.request(
  'v1/dataset',
  { name: 'prompt-completion-dataset',
    data: File.read('./prompt-completion.jsonl'),
    dataset_type: 'prompt-completion-finetune-input' },
  request_method: 'POST', server_sent_events: false
)
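The POST example reads its data from a `.jsonl` file. The sketch below shows one way to produce such content in Ruby. It assumes, without confirmation from this README, that each line is a JSON object with `prompt` and `completion` keys; check the Cohere dataset documentation for the exact schema. The rows are made up for illustration.

```ruby
require 'json'

# Hypothetical training rows; the prompt/completion keys are an assumption
# about the expected schema, not something this gem defines.
rows = [
  { prompt: 'Say hello.', completion: 'Hello!' },
  { prompt: 'Say goodbye.', completion: 'Goodbye!' }
]

# JSONL: one JSON object per line.
jsonl = rows.map { |row| JSON.generate(row) }.join("\n")

puts jsonl
```

The resulting string could be written with `File.write('./prompt-completion.jsonl', jsonl)` before making the request.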

Connectors

Documentation: https://docs.cohere.com/reference/list-connectors

result = client.request(
  'v1/connectors',
  request_method: 'GET', server_sent_events: false
)

result = client.request(
  'v1/connectors',
  { name: 'test-connector',
    url: 'https://example.com/search',
    description: 'A test connector' },
  request_method: 'POST', server_sent_events: false
)

Streaming and Server-Sent Events (SSE)

Server-Sent Events (SSE) is a technology that allows certain endpoints to offer streaming capabilities, such as creating the impression that "the model is typing along with you," rather than delivering the entire answer all at once.

You can set up the client to use Server-Sent Events (SSE) for all supported endpoints:

client = Cohere.new(
  credentials: { api_key: ENV['COHERE_API_KEY'] },
  options: { server_sent_events: true }
)

Or, you can decide on a request basis:

client.chat(
  { model: 'command', stream: true, message: 'Hi!' },
  server_sent_events: true
) do |event, raw|
  puts event
end

With Server-Sent Events (SSE) enabled, you can use a block to receive partial results via events. This feature is particularly useful for methods that offer streaming capabilities, such as chat (see Receiving Stream Events).

Server-Sent Events (SSE) Hang

Method calls do not return until the server-sent events finish, so even without providing a block, you can obtain the final results of the received events (see Receiving Stream Events).
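Once the call returns, the complete response is carried by the `stream-end` event. A minimal sketch of picking it out of the returned array, assuming the event shapes shown in Receiving Stream Events; `final_response` is a hypothetical helper, not part of the gem, and the sample events are shortened.

```ruby
# Hypothetical helper (not part of the gem): returns the 'response' hash
# carried by the 'stream-end' event, or nil if no such event was received.
def final_response(events)
  finished = events.find { |event| event['event_type'] == 'stream-end' }
  finished && finished['response']
end

# Illustrative events, shortened from the shapes shown earlier.
events = [
  { 'is_finished' => false, 'event_type' => 'stream-start' },
  { 'is_finished' => false, 'event_type' => 'text-generation', 'text' => 'Hi' },
  { 'is_finished' => true,
    'event_type' => 'stream-end',
    'response' => { 'text' => 'Hi there!', 'tool_inputs' => nil },
    'finish_reason' => 'COMPLETE' }
]

puts final_response(events)['text']
```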

Back-and-Forth Conversations

To maintain a back-and-forth conversation, you need to append the received responses and build a history for your requests:

result = client.chat(
  { model: 'command',
    chat_history: [
      { role: 'USER', message: 'Hi, my name is Purple.' },
      { role: 'CHATBOT', message: "Hi Purple! It's nice to meet you." }
    ],
    message: "What's my name?" }
)

Result:

{ 'response_id' => 'a2009c24-fee3-465e-9945-a85edcdcb3cf',
  'text' =>
  "Your name is Purple. Isn't it an enchanting name? \n" \
    "\n" \
    "Would you like me to help you with anything else? If you have any questions or need assistance with a particular task, feel free to let me know. I'm here to help!",
  'generation_id' => 'e1630c10-f773-4ced-8760-19199004653a',
  'token_count' => { 'prompt_tokens' => 91, 'response_tokens' => 51, 'total_tokens' => 142, 'billed_tokens' => 124 },
  'meta' => { 'api_version' => { 'version' => '1' },
              'billed_units' => { 'input_tokens' => 73, 'output_tokens' => 51 } },
  'tool_inputs' => nil }
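Appending to the history by hand after every exchange gets repetitive, so it can be wrapped in a small object. This is a sketch, not part of the gem: `Conversation` is a hypothetical class, and it takes any callable that accepts a payload hash and returns a hash with a `'text'` key. A lambda stands in for the real client here so the example runs without network access.

```ruby
# Hypothetical wrapper (not part of the gem): keeps chat_history in sync
# with each exchange, using the USER/CHATBOT roles shown above.
class Conversation
  attr_reader :history

  def initialize(chat, model: 'command')
    @chat = chat
    @model = model
    @history = []
  end

  # Sends the message with the history so far, then records both sides.
  def say(message)
    result = @chat.call({ model: @model, chat_history: @history.dup, message: message })
    @history << { role: 'USER', message: message }
    @history << { role: 'CHATBOT', message: result['text'] }
    result['text']
  end
end

# Stand-in for the real client; it simply echoes the message back.
fake_chat = ->(payload) { { 'text' => "You said: #{payload[:message]}" } }

conversation = Conversation.new(fake_chat)
conversation.say('Hi, my name is Purple.')
puts conversation.say("What's my name?")
```

With the gem loaded, the callable could be `->(payload) { client.chat(payload) }`.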

New Functionalities and APIs

Cohere may launch a new endpoint that we haven't covered in the Gem yet. If that's the case, you may still be able to use it through the request method. For example, chat is just a wrapper for v1/chat, which you can call directly like this:

result = client.request(
  'v1/chat',
  { model: 'command', message: 'Hi!' },
  request_method: 'POST', server_sent_events: true
)

Request Options

Adapter

The gem uses Faraday with the Typhoeus adapter by default.

You can use a different adapter if you want:

require 'faraday/net_http'

client = Cohere.new(
  credentials: { api_key: ENV['COHERE_API_KEY'] },
  options: { connection: { adapter: :net_http } }
)

Timeout

You can set the maximum number of seconds to wait for the request to complete with the timeout option:

client = Cohere.new(
  credentials: { api_key: ENV['COHERE_API_KEY'] },
  options: { connection: { request: { timeout: 5 } } }
)

You can also have more fine-grained control over Faraday's Request Options if you prefer:

client = Cohere.new(
  credentials: { api_key: ENV['COHERE_API_KEY'] },
  options: {
    connection: {
      request: {
        timeout: 5,
        open_timeout: 5,
        read_timeout: 5,
        write_timeout: 5
      }
    }
  }
)

Error Handling

Rescuing

require 'cohere-ai'

begin
  client.chat(
    { model: 'command', message: 'Hi!' }
  )
rescue Cohere::Errors::CohereError => error
  puts error.class # Cohere::Errors::RequestError
  puts error.message # 'the server responded with status 500'

  puts error.payload
  # { model: 'command',
  #   message: 'Hi!'
  #   ...
  # }

  puts error.request
  # #<Faraday::ServerError response={:status=>500, :headers...
end
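Some failures, such as a transient status 500, may be worth retrying. The sketch below is one possible approach, not something the gem provides: `with_retries` is a hypothetical helper that re-runs a block when it raises the given error class, doubling the wait after each failed attempt. Whether a retry is appropriate depends on the failure, so treat the defaults as placeholders.

```ruby
# Hypothetical helper (not part of the gem): retries the block when it
# raises error_class, waiting base_delay, then 2x, 4x, ... between attempts.
# Re-raises the error once the number of attempts is exhausted.
def with_retries(error_class, attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    yield
  rescue error_class
    tries += 1
    raise if tries >= attempts

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Demonstration with a block that fails twice before succeeding.
calls = 0
value = with_retries(RuntimeError, attempts: 3, base_delay: 0) do
  calls += 1
  raise 'temporary failure' if calls < 3

  :ok
end

puts value
```

With the gem loaded, the call might look like `with_retries(Cohere::Errors::RequestError) { client.chat({ model: 'command', message: 'Hi!' }) }`.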

For Short

require 'cohere-ai/errors'

begin
  client.chat(
    { model: 'command', message: 'Hi!' }
  )
rescue CohereError => error
  puts error.class # Cohere::Errors::RequestError
end

Errors

CohereError

MissingAPIKeyError
BlockWithoutServerSentEventsError
IncompleteJSONReceivedError

RequestError

Development

bundle
rubocop -A

bundle exec ruby spec/tasks/run-client.rb

Purpose

This Gem is designed to provide low-level access to Cohere, enabling people to build abstractions on top of it. If you are interested in more high-level abstractions or more user-friendly tools, you may want to consider Nano Bots 💎 🤖.

Publish to RubyGems

gem build cohere-ai.gemspec

gem signin

gem push cohere-ai-1.1.0.gem

Updating the README

Install Babashka:

curl -s https://raw.githubusercontent.com/babashka/babashka/master/install | sudo bash

Update the template.md file and then:

bb tasks/generate-readme.clj

Trick for automatically updating the README.md when template.md changes:

sudo pacman -S inotify-tools # Arch / Manjaro
sudo apt-get install inotify-tools # Debian / Ubuntu / Raspberry Pi OS
sudo dnf install inotify-tools # Fedora / CentOS / RHEL

while inotifywait -e modify template.md; do bb tasks/generate-readme.clj; done

Trick for Markdown Live Preview:

pip install -U markdown_live_preview

mlp README.md -p 8076

Resources and References

These resources and references may be useful throughout your learning process.

Disclaimer

This is not an official Cohere project, nor is it affiliated with Cohere in any way.

This software is distributed under the MIT License. This license includes a disclaimer of warranty. Moreover, the authors assume no responsibility for any damage or costs that may result from using this project. Use the Cohere AI Ruby Gem at your own risk.
