Caridy's Codex

Introducing Open Schema APIs: Bridging Traditional APIs and Language Models

In the rapidly evolving landscape of AI and machine learning, language models like OpenAI's GPT-4 have shown us the incredible possibilities of natural language understanding and generation. But as developers and engineers, how do we best leverage these technologies to enhance our existing systems and create more flexible, user-friendly experiences? One promising approach lies in the intersection of traditional APIs and large language models: Open Schema APIs.

Traditional APIs have been essential for systems integration, enabling data exchange in a structured, predictable way. Yet, for complex or generic services like search engines, data of interest is diverse and context-dependent, leading these APIs to deliver results in natural language. This format, while informative, lacks the structure APIs are renowned for. Furthermore, we anticipate a wave of new services based on large language models (LLMs) that will produce natural language content. This is where Open Schema APIs enter the picture, offering a potential solution to structure this rich yet unstructured data.

Understanding Open Schema APIs

Open Schema APIs are an innovative way of structuring the data delivered by APIs. This approach leverages the strengths of both structured data and modern language models, blending the two to deliver precise and tailored data in response to API calls.

At the heart of an Open Schema API is a user-defined schema, which is provided along with the API request. This schema serves as a 'blueprint' for how the user wants the returned data to be structured, allowing for a high degree of flexibility and specificity in the API's response.

For instance, consider a call to a search engine API inquiring about a company, let's say Nike. With an Open Schema API, the request would also include a schema defining what data the user is interested in and how it should be structured. For example, a JSON Schema could specify an object with the fields 'name', 'founding_date', 'CEO', 'headquarters', and 'stock_price'.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "founding_date": {
      "type": "string",
      "format": "date"
    },
    "CEO": {
      "type": "string"
    },
    "headquarters": {
      "type": "string"
    },
    "stock_price": {
      "type": "number"
    }
  },
  "required": ["name", "founding_date", "CEO", "headquarters", "stock_price"]
}

Upon receiving this request, the API retrieves the relevant information and uses a large language model to parse and format it into a response that matches the schema. The caller receives a structured record for Nike in exactly the shape they specified.

{
    "name": "Nike, Inc.",
    "founding_date": "1964-01-25",
    "CEO": "John Donahoe",
    "headquarters": "Beaverton, OR",
    "stock_price": 105.10
}
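Because the schema travels with the request, either side can check that a response actually matches it. The following is a minimal Python sketch of such a check, not a definitive implementation: the function name schema_problems is hypothetical, and it only understands the small subset of JSON Schema used in this post ("required", "properties", "type", and "format": "date"). A real service would more likely rely on a complete JSON Schema validator.

```python
from datetime import date

# JSON Schema primitive types used in this post, mapped to Python types.
_PY_TYPES = {"string": str, "number": (int, float), "object": dict}


def schema_problems(record, schema):
    """Return a list of mismatches; an empty list means the record conforms.

    Illustrative only: handles "required", "properties", "type", and
    "format": "date". It is not a full JSON Schema validator.
    """
    problems = []
    for field in schema.get("required", []):
        if field not in record:
            problems.append(f"{field}: missing")
    for field, spec in schema.get("properties", {}).items():
        if field not in record:
            continue
        value = record[field]
        expected = _PY_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (isinstance(value, bool) or not isinstance(value, expected)):
            problems.append(f"{field}: expected {spec['type']}")
        elif spec.get("format") == "date":
            try:
                date.fromisoformat(value)  # accepts YYYY-MM-DD
            except (TypeError, ValueError):
                problems.append(f"{field}: not an ISO date")
    return problems


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "founding_date": {"type": "string", "format": "date"},
        "CEO": {"type": "string"},
        "headquarters": {"type": "string"},
        "stock_price": {"type": "number"},
    },
    "required": ["name", "founding_date", "CEO", "headquarters", "stock_price"],
}
record = {
    "name": "Nike, Inc.",
    "founding_date": "1964-01-25",
    "CEO": "John Donahoe",
    "headquarters": "Beaverton, OR",
    "stock_price": 105.10,
}
print(schema_problems(record, schema))  # prints []
```

Returning a list of problems rather than a boolean makes it easy to report exactly which fields failed.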

What makes these APIs 'open' is the flexible, user-defined nature of the schemas. The data that can be extracted and its structure are dictated by the schema provided with each request, allowing users to tailor API responses to their specific needs.

In addition, the inclusion of the schema in the request can help the API itself to optimize the procurement of relevant text in natural language, effectively making the schema a part of the input to the search. Thus, the Open Schema API makes full use of the potential of modern language models not just for data extraction, but also for enhancing the quality of the data being fetched.
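One naive way to picture the schema acting as search input is to turn its property names into extra query terms. The Python sketch below assumes exactly that; augment_query is a hypothetical helper, and an actual service might instead ask the language model to rewrite the query or use the schema in some other way entirely.

```python
def augment_query(query, schema):
    """Append human-readable field names from the schema to a search query."""
    # "founding_date" becomes "founding date"; order follows the schema.
    hints = [name.replace("_", " ") for name in schema.get("properties", {})]
    return " ".join([query, *hints])


schema = {
    "type": "object",
    "properties": {
        "founding_date": {"type": "string", "format": "date"},
        "CEO": {"type": "string"},
        "stock_price": {"type": "number"},
    },
}
print(augment_query("Nike", schema))  # prints: Nike founding date CEO stock price
```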

The Power of Open Schema APIs

Open Schema APIs offer a significant advancement over traditional APIs. They bridge the gap between the unstructured data often returned by general APIs and the structured, specific data that developers need for many applications.

Here are a few reasons why Open Schema APIs are a powerful tool:

  1. Flexibility: With Open Schema APIs, the user defines the schema they want the data to match. This allows for a high degree of customization in the API's responses, enabling users to get the exact data they need in the exact format they want.
  2. Efficiency: By directly returning data in the desired structure, Open Schema APIs can reduce the amount of data processing on the user's end. This streamlines the integration of API responses into existing systems and applications.
  3. Enhanced Search Quality: Including the schema as part of the API request can also help in procuring more relevant natural language text, thus enhancing the quality of the data being fetched.
  4. Simplicity: While an Open Schema API leverages a complex, fine-tuned language model internally, the complexity is abstracted away from the end user. The process of making a request and receiving structured data in response is simple and straightforward.

As for the applications of Open Schema APIs, they are broad and diverse. They can be used for data extraction from generic APIs, populating databases with structured records, processing natural language responses from chatbot APIs, and much more. The possibilities are extensive and largely depend on the needs of the user.

A later section, "Implementing Open Schema APIs Using OpenAI GPT Models", takes a deeper look at how an Open Schema API could be implemented in practice.

Emerging Trend: Language Model-Based APIs

As language models like GPT-4 advance, we predict a surge in APIs leveraging their capabilities, producing human-friendly, natural language responses. However, extracting specific data from these responses can be challenging. Open Schema APIs, which allow users to define their data format, can effectively address this issue, delivering structured, machine-friendly data. For developers and companies building APIs with natural language responses, incorporating Open Schema could significantly enhance usability and flexibility.

Interacting with Open Schema APIs

Now that we've covered the concept of Open Schema APIs and how they could work in an example, let's delve into some of the more nuanced aspects, such as handling mismatches and incomplete schemas.

Handling Mismatches and Incomplete Schemas

There may be times when the provided schema cannot be fully fulfilled. For instance, the user might request information that simply isn't available, or the search engine might not find exact matches for the schema fields. In such cases, the Open Schema API could return a partial result along with metadata about which fields could not be filled and why.

For example, if we don't have a stock price for Nike, Inc.:

{
    "data": {
        "name": "Nike, Inc.",
        "founding_date": "1964-01-25",
        "CEO": "John Donahoe",
        "headquarters": "Beaverton, OR"
    },
    "missing_fields": ["stock_price"],
    "reason": "The stock price information is not available at this time."
}

This approach gives the user feedback on the success of their request and transparency about what information could not be retrieved.
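A partial response of this kind could be assembled with a few lines of code. The Python sketch below is an assumption about how a service might do it, not a specification: build_response is a hypothetical name, and it mirrors the two response shapes shown in this post by returning the bare record when every required field is present and the data/missing_fields/reason envelope otherwise.

```python
def build_response(extracted, schema, reason):
    """Return the record itself, or a partial-result envelope if fields are missing."""
    # Treat None as "not found" so extractors can signal gaps explicitly.
    data = {key: value for key, value in extracted.items() if value is not None}
    missing = [field for field in schema.get("required", []) if field not in data]
    if not missing:
        return data
    return {"data": data, "missing_fields": missing, "reason": reason}


schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "stock_price": {"type": "number"},
    },
    "required": ["name", "stock_price"],
}
partial = build_response(
    {"name": "Nike, Inc.", "stock_price": None},
    schema,
    reason="The stock price information is not available at this time.",
)
print(partial["missing_fields"])  # prints ['stock_price']
```

A single reason string follows the example above; a service that wants finer-grained feedback could return one reason per missing field instead.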

Implementing Open Schema APIs Using OpenAI GPT Models

If you're eager to test out the concept of Open Schema APIs today, OpenAI's GPT models with function calling capabilities provide a viable path forward.

Suppose you want to use a generic API, like a search engine API, but need the result structured according to your own schema. You can create a new wrapper API around this existing generic API. Here's the process:

  1. Your wrapper API receives the user's query and the desired schema.
  2. It calls the existing API with the query to retrieve information in a generic form (e.g., natural language text).
  3. Your wrapper API then post-processes the result by sending it to the GPT model along with the schema, using the function calling capability of the model.
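The three steps above can be sketched as a small Python function. Everything here is illustrative: open_schema_search, stub_search, and stub_extract are hypothetical names, and the two stubs stand in for the real search API and the GPT call so that the sketch runs offline.

```python
from typing import Callable


def open_schema_search(
    query: str,
    schema: dict,
    search: Callable[[str], str],
    extract: Callable[[str, dict], dict],
) -> dict:
    """Step 1 is the call itself: the wrapper receives the query and schema."""
    text = search(query)          # step 2: fetch generic, natural language text
    return extract(text, schema)  # step 3: structure it according to the schema


# Stand-ins so the sketch runs offline; neither is a real service.
def stub_search(query: str) -> str:
    return "Headquarters: Beaverton, OR\nCEO: John Donahoe"


def stub_extract(text: str, schema: dict) -> dict:
    # Naive "Label: value" matching in place of an LLM call.
    found = {}
    for line in text.splitlines():
        label, sep, value = line.partition(": ")
        if sep:
            found[label.lower()] = value
    return {
        name: found[name.lower()]
        for name in schema.get("properties", {})
        if name.lower() in found
    }


schema = {
    "type": "object",
    "properties": {"CEO": {"type": "string"}, "headquarters": {"type": "string"}},
}
result = open_schema_search("Nike", schema, stub_search, stub_extract)
print(result)  # {'CEO': 'John Donahoe', 'headquarters': 'Beaverton, OR'}
```

Passing the search and extraction steps in as callables keeps the wrapper independent of any particular search provider or model.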

The request to the GPT model would look like this:

curl https://api.openai.com/v1/chat/completions -u :$OPENAI_API_KEY -H 'Content-Type: application/json' -d '{
  "model": "gpt-3.5-turbo-0613",
  "temperature": 0,
  "messages": [
    {
      "role": "user",
      "content": "Nike, Inc. is an American multinational corporation that is engaged in the design, development, manufacturing, and worldwide marketing and sales of footwear, apparel, equipment, accessories, and services. The company is headquartered near Beaverton, Oregon, in the Portland metropolitan area. Wikipedia\n\nFounders: Phil Knight, Bill Bowerman\nCustomer service: 1 (800) 806-6453\nHeadquarters: Beaverton, OR\nStock price: NKE (NYSE) $105.10 -2.00 (-1.87%)\nJul 6, 4:00 PM EDT - Disclaimer\nFounded: January 25, 1964, Eugene, OR\nCEO: John Donahoe (Jan 13, 2020–)\nRevenue: 37.4 billion USD\nSubsidiaries: Converse, Nike Thailand, Nike Korea LLC"
    }
  ],
  "functions": [
    {
      "name": "format_output",
      "description": "extracts structured data from user-provided message",
      "parameters": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "founding_date": { "type": "string", "format": "date" },
          "CEO": { "type": "string" },
          "headquarters": { "type": "string" },
          "stock_price": { "type": "number" }
        },
        "required": [
          "name",
          "founding_date",
          "CEO",
          "headquarters",
          "stock_price"
        ]
      }
    }
  ],
  "function_call": { "name": "format_output" }
}
'

In this request, format_output is the function that the GPT model is forced to call through the function_call field. The user message carries the natural language text returned by the underlying API, and the parameters of format_output are the user-provided schema specifying the desired output format.
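A wrapper would typically build this request body from whatever schema the user supplies rather than hard-coding it. The Python sketch below assumes that approach: build_extraction_request is a hypothetical helper, the body it returns mirrors the curl example above, and dropping the "$schema" key is a choice made here so that the parameters match that example. It only constructs the payload and does not send it.

```python
import json


def build_extraction_request(text, schema, model="gpt-3.5-turbo-0613"):
    """Build a chat completions body that forces a call to format_output."""
    # Drop the "$schema" metadata key so the parameters match the curl example.
    parameters = {key: value for key, value in schema.items() if key != "$schema"}
    return {
        "model": model,
        "temperature": 0,
        "messages": [{"role": "user", "content": text}],
        "functions": [
            {
                "name": "format_output",
                "description": "extracts structured data from user-provided message",
                "parameters": parameters,
            }
        ],
        "function_call": {"name": "format_output"},
    }


user_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {"name": {"type": "string"}, "CEO": {"type": "string"}},
    "required": ["name", "CEO"],
}
body = build_extraction_request("CEO: John Donahoe", user_schema)
payload = json.dumps(body)  # the string to send as the HTTP request body
```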

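The GPT model processes this input, extracts the relevant information from the message content based on the schema, and responds with a function_call to format_output whose arguments field is a JSON-encoded string holding the structured data. The model is not guaranteed to honor every schema constraint: in the response below, founding_date comes back as "January 25, 1964" rather than in the YYYY-MM-DD form that "format": "date" calls for, so the wrapper should validate and normalize the result. The relevant portion of the response looks like this: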

{
  "role": "assistant",
  "content": null,
  "function_call": {
    "name": "format_output",
    "arguments": "{\n  \"name\": \"Nike, Inc.\",\n  \"founding_date\": \"January 25, 1964\",\n  \"CEO\": \"John Donahoe\",\n  \"headquarters\": \"Beaverton, OR\",\n  \"stock_price\": 105.10\n}"
  }
}
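Since arguments arrives as a JSON-encoded string, the wrapper has to decode it before use, and the founding_date value in this response is not in ISO form. The Python sketch below shows one possible way to handle both; parse_function_call and normalize_date are hypothetical helpers, and the list of accepted date formats is an assumption that a real wrapper would need to extend.

```python
import json
from datetime import datetime


def parse_function_call(message, expected_name="format_output"):
    """Decode the JSON string in message["function_call"]["arguments"]."""
    call = message.get("function_call") or {}
    if call.get("name") != expected_name:
        raise ValueError(f"expected a call to {expected_name}")
    return json.loads(call["arguments"])


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    # "%B" matches English month names under the default C locale.
    for fmt in ("%Y-%m-%d", "%B %d, %Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None


message = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "format_output",
        "arguments": '{"name": "Nike, Inc.", "founding_date": "January 25, 1964"}',
    },
}
record = parse_function_call(message)
record["founding_date"] = normalize_date(record["founding_date"])
print(record["founding_date"])  # prints 1964-01-25
```

Returning None for an unrecognized date lets the wrapper report the field as missing instead of passing through a value that violates the schema.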

While this approach requires more setup and doesn't provide all the benefits of Open Schema APIs directly (as the extraction and structuring are done in a separate wrapping layer, not in the original API itself), it showcases the potential of Open Schema APIs. It allows you to interact with existing APIs in a more structured and user-defined way, bringing us one step closer to the vision of Open Schema APIs.

Conclusion

Open Schema APIs offer an intriguing approach to enhancing the versatility, adaptability, and usability of APIs. By leveraging the powerful language comprehension and generation capabilities of large language models like OpenAI's GPT, APIs could potentially adapt dynamically to a user's specific data requirements. This could be done without requiring predefined schemas or substantial data transformations on the user's end.

The concept opens the door to exciting possibilities for intuitive and powerful API usage. For instance, search APIs could return data structured exactly according to a user-provided schema, potentially saving developers countless hours of data wrangling. Similarly, other types of APIs could evolve to accommodate changing requirements without necessitating extensive updates or modifications to their existing structures.

As with any emerging technology, there are considerations and challenges to address. Handling schemas that cannot be fulfilled, ensuring data integrity and accuracy, and training the underlying LLM to interpret and generate data accurately all require careful thought and planning.

Nonetheless, the concept of Open Schema APIs brings a new level of potential flexibility and power to the world of APIs. While it remains an exploratory concept, its potential for making API interactions more efficient, flexible, and user-centric is certainly intriguing.

As of now, widespread adoption of Open Schema APIs is still a future possibility. However, developers can already begin exploring this concept using tools like OpenAI's GPT models. It's an exciting area of exploration that could potentially reshape our understanding of how APIs can work.

Whether Open Schema APIs become a standard in the future or not, they certainly present a new dimension of flexibility and adaptability in our interaction with APIs. As such, they deserve attention and exploration from forward-thinking developers and technologists.

Written by @caridy | July 7, 2023
Tags: ai, llm, prompting, api, schema, all