[FR] Document Object Model Integration #70

Open
AdamSobieski opened this issue Jan 9, 2025 · 0 comments
Labels
enhancement New feature or request

AdamSobieski commented Jan 9, 2025

What if, in addition to text-string prompts, Document Object Model (DOM) elements and/or document fragments could be used as prompts? This would enable model-independent multimodal prompting in a way that is intuitive to Web developers.

Such multimodal prompts could utilize <p>, <img>, <picture>, <audio>, and <video> elements; perhaps <table> and its related elements; perhaps <html>, <head>, <meta>, <link>, and <body> elements; and, perhaps, <a> elements.

For example, to provide an <img> element in a prompt, one could utilize the data URI scheme:

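// Build a fragment containing a single image encoded as a data: URI.
const fragment = document.createDocumentFragment();
const img = document.createElement("img");
img.setAttribute("src", "data:image/png;base64,...");
fragment.append(img);

// ...

// Proposed: pass the fragment where a text prompt would normally go.
const result = await session.prompt(fragment);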

or specify a URL:

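// Build a fragment containing a single image referenced by URL.
const fragment = document.createDocumentFragment();
const img = document.createElement("img");
img.setAttribute("src", "http://www.example.com/images/123.png");
fragment.append(img);

// ...

// Proposed: pass the fragment where a text prompt would normally go.
const result = await session.prompt(fragment);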
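
To make the idea more concrete, here is a rough sketch of how a fragment might be flattened into an ordered list of typed parts before being handed to a model. Everything here is an assumption for illustration: `fragmentToParts` and the `{ type, value }` part shape are hypothetical names, not part of any existing API, and only text, `<img>`, `<audio>`, and `<video>` are handled.

```javascript
// Hypothetical helper: flattens a DocumentFragment (or element) into an
// ordered list of { type, value } parts. The part shape is an assumption
// made for illustration; it is not defined by any existing API.
const MEDIA_TAGS = { IMG: "image", AUDIO: "audio", VIDEO: "video" };

function fragmentToParts(root) {
  const parts = [];
  for (const node of root.childNodes) {
    if (node.nodeType === 3) {
      // Text node: keep only non-empty text.
      const text = node.textContent.trim();
      if (text) parts.push({ type: "text", value: text });
    } else if (node.nodeType === 1) {
      const kind = MEDIA_TAGS[node.tagName];
      if (kind) {
        // Media element: pass along its src (a data: URI or a URL).
        parts.push({ type: kind, value: node.getAttribute("src") });
      } else {
        // Any other element (e.g. <p>): recurse into its children.
        parts.push(...fragmentToParts(node));
      }
    }
  }
  return parts;
}
```

Document order is preserved, so a `<p>` followed by an `<img>` would yield a text part followed by an image part.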

P.S.: Other approaches considered for enabling multimodal prompts include:

  1. The MediaStream interface, which could be of use for enabling voice capabilities (mentioned in #40).
  2. The Clipboard interfaces or DataTransfer interfaces.
    1. Capabilities for exchanging media streams could be added to the existing capabilities for exchanging data and files.
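
As a sketch of the DataTransfer-based alternative, the snippet below turns a DataTransfer-like object into a list of typed parts. `dataTransferToParts` and the `{ type, value }` part shape are hypothetical and shown only for illustration; `getData("text/plain")` and `files` are the standard DataTransfer members.

```javascript
// Hypothetical helper: turns a DataTransfer-like object (from drag-and-drop
// or the clipboard) into { type, value } parts. The part shape is an
// assumption made for illustration.
function dataTransferToParts(dataTransfer) {
  const parts = [];
  // getData returns an empty string when no plain text is present.
  const text = dataTransfer.getData("text/plain");
  if (text) parts.push({ type: "text", value: text });
  for (const file of dataTransfer.files) {
    // Classify by top-level MIME type, e.g. "image" from "image/png".
    const kind = file.type.split("/")[0];
    if (kind === "image" || kind === "audio" || kind === "video") {
      parts.push({ type: kind, value: file });
    }
  }
  return parts;
}
```

In a `paste` event handler, this could be called as `dataTransferToParts(event.clipboardData)`. Files of other types are skipped in this sketch.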
domenic added a commit that referenced this issue Jan 20, 2025
Closes #40. Somewhat helps with #70.
@domenic domenic added the enhancement New feature or request label Jan 23, 2025
domenic added a commit that referenced this issue Feb 25, 2025
Closes #40. Somewhat helps with #70.