Could 744/629 be used to chain these LLM’s together? Or is it more of 521 or both?
This kinda stuck out.
Looking forward, the future of generative AI lies in creatively chaining all sorts of LLMs and knowledge bases together to create new kinds of assistants that deliver authoritative results users can verify.
Mahalo Doc . very informative Do you see some 744 in there?
What Is Retrieval-Augmented Generation, aka RAG?
Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
November 15, 2023 by Rick Merritt
To understand the latest advance in generative AI, imagine a courtroom.
Judges hear and decide cases based on their general understanding of the law. Sometimes a case — like a malpractice suit or a labor dispute — requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite.
Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research.
The court clerk of AI is a process called retrieval-augmented generation, or RAG for short.
How It Got Named ‘RAG’
Patrick Lewis, lead author of the 2020 paper that coined the term, apologized for the unflattering acronym that now describes a growing family of methods across hundreds of papers and dozens of commercial services he believes represent the future of generative AI.
[Image: Patrick Lewis, lead author of the RAG paper]
“We definitely would have put more thought into the name had we known our work would become so widespread,” Lewis said in an interview from Singapore, where he was sharing his ideas with a regional conference of database developers.
“We always planned to have a nicer sounding name, but when it came time to write the paper, no one had a better idea,” said Lewis, who now leads a RAG team at AI startup Cohere.
So, What Is Retrieval-Augmented Generation (RAG)?
Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM’s parameters essentially represent the general patterns of how humans use words to form sentences.
That deep understanding, sometimes called parameterized knowledge, makes LLMs useful in responding to general prompts at light speed. However, it does not serve users who want a deeper dive into a current or more specific topic.
Combining Internal, External Resources
Lewis and colleagues developed retrieval-augmented generation to link generative AI services to external resources, especially ones rich in the latest technical details.
The paper, with coauthors from the former Facebook AI Research (now Meta AI), University College London and New York University, called RAG “a general-purpose fine-tuning recipe” because it can be used by nearly any LLM to connect with practically any external resource.
Building User Trust
Retrieval-augmented generation gives models sources they can cite, like footnotes in a research paper, so users can check any claims. That builds trust.
What’s more, the technique can help models clear up ambiguity in a user query. It also reduces the possibility a model will make a wrong guess, a phenomenon sometimes called hallucination.
Another great advantage of RAG is that it's relatively easy to implement. A blog by Lewis and three of the paper's coauthors said developers can implement the process with as few as five lines of code.
That makes the method faster and less expensive than retraining a model with additional datasets. And it lets users hot-swap new sources on the fly.
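The hot-swapping idea can be illustrated with a toy sketch. This is not the authors' code: the `retrieve` and `answer` functions, the word-overlap matching and the two sample knowledge bases are all illustrative assumptions, standing in for a real retriever and LLM.

```python
# Toy sketch of "hot-swapping" knowledge sources: retrieval is a simple
# word-overlap match, and the "LLM" is just a prompt template.
def tokenize(text):
    """Lowercase word set with trailing punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(query, corpus):
    """Return the passage sharing the most words with the query."""
    q = tokenize(query)
    return max(corpus, key=lambda passage: len(q & tokenize(passage)))

def answer(query, corpus):
    """Assemble a retrieval-augmented prompt: retrieved context + question."""
    return f"Context: {retrieve(query, corpus)}\nQuestion: {query}"

medical_kb = ["Aspirin can thin the blood.", "Ibuprofen reduces inflammation."]
finance_kb = ["Bond prices fall when interest rates rise."]

# Swapping the knowledge base changes the grounding; nothing is retrained.
print(answer("What thins the blood?", medical_kb))
print(answer("What happens to bond prices?", finance_kb))
```

The point of the sketch is the last two lines: the same pipeline answers from a different domain simply by being handed a different corpus, which is what makes RAG cheaper than retraining.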
How People Are Using RAG
With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets.
For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.
In fact, almost any business can turn its technical or policy manuals, videos or logs into resources called knowledge bases that can enhance LLMs. These sources can enable use cases such as customer or field support, employee training and developer productivity.
The broad potential is why companies including AWS, IBM, Glean, Google, Microsoft, NVIDIA, Oracle and Pinecone are adopting RAG.
Getting Started With Retrieval-Augmented Generation
To help users get started, NVIDIA developed an AI workflow for retrieval-augmented generation. It includes a sample chatbot and the elements users need to create their own applications with this new method.
The workflow uses NVIDIA NeMo, a framework for developing and customizing generative AI models, as well as software like NVIDIA Triton Inference Server and NVIDIA TensorRT-LLM for running generative AI models in production.
The software components are all part of NVIDIA AI Enterprise, a software platform that accelerates development and deployment of production-ready AI with the security, support and stability businesses need.
Getting the best performance for RAG workflows requires massive amounts of memory and compute to move and process data. The NVIDIA GH200 Grace Hopper Superchip, with its 288GB of fast HBM3e memory and 8 petaflops of compute, is ideal — it can deliver a 150x speedup over using a CPU.
Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.
RAG doesn’t require a data center. LLMs are debuting on Windows PCs, thanks to NVIDIA software that enables all sorts of applications users can access even on their laptops.
[Chart: an example application for RAG on a PC]
PCs equipped with NVIDIA RTX GPUs can now run some AI models locally. By using RAG on a PC, users can link to a private knowledge source – whether that be emails, notes or articles – to improve responses. The user can then feel confident that their data source, prompts and response all remain private and secure.
A recent blog provides an example of RAG accelerated by TensorRT-LLM for Windows to get better results fast.
The History of RAG
The roots of the technique go back at least to the early 1970s. That’s when researchers in information retrieval prototyped what they called question-answering systems, apps that use natural language processing (NLP) to access text, initially in narrow topics such as baseball.
The concepts behind this kind of text mining have remained fairly constant over the years. But the machine learning engines driving them have grown significantly, increasing their usefulness and popularity.
In the mid-1990s, the Ask Jeeves service, now Ask.com, popularized question answering with its mascot of a well-dressed valet. IBM’s Watson became a TV celebrity in 2011 when it handily beat two human champions on the Jeopardy! game show.
[Image: Ask Jeeves, an early RAG-like web service]
Today, LLMs are taking question-answering systems to a whole new level.
Insights From a London Lab
The seminal 2020 paper arrived as Lewis was pursuing a doctorate in NLP at University College London and working for Meta at a new London AI lab. The team was searching for ways to pack more knowledge into an LLM’s parameters and using a benchmark it developed to measure its progress.
Building on earlier methods and inspired by a paper from Google researchers, the group “had this compelling vision of a trained system that had a retrieval index in the middle of it, so it could learn and generate any text output you wanted,” Lewis recalled.
[Image: The IBM Watson question-answering system became a celebrity when it won big on the TV game show Jeopardy!]
When Lewis plugged a promising retrieval system from another Meta team into the work in progress, the first results were unexpectedly impressive.
“I showed my supervisor and he said, ‘Whoa, take the win. This sort of thing doesn’t happen very often,’ because these workflows can be hard to set up correctly the first time,” he said.
Lewis also credits major contributions from team members Ethan Perez and Douwe Kiela, then of New York University and Facebook AI Research, respectively.
When complete, the work, which ran on a cluster of NVIDIA GPUs, showed how to make generative AI models more authoritative and trustworthy. It’s since been cited by hundreds of papers that amplified and extended the concepts in what continues to be an active area of research.
How Retrieval-Augmented Generation Works
At a high level, here’s how an NVIDIA technical brief describes the RAG process.
When users ask an LLM a question, the AI model sends the query to another model that converts it into a numeric format so machines can read it. The numeric version of the query is sometimes called an embedding or a vector.
[Diagram: retrieval-augmented generation combines LLMs with embedding models and vector databases]
The embedding model then compares these numeric values to vectors in a machine-readable index of an available knowledge base. When it finds a match or multiple matches, it retrieves the related data, converts it to human-readable words and passes it back to the LLM.
Finally, the LLM combines the retrieved words and its own response to the query into a final answer it presents to the user, potentially citing sources the embedding model found.
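The embed-compare-retrieve loop described above can be reduced to a few lines. In this hypothetical sketch the three-number "embeddings" are hand-made stand-ins for a real embedding model, and the `index` dict stands in for a vector database:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "vector index": passage -> pre-computed embedding.
# Real systems use learned embeddings and approximate nearest-neighbor search.
index = {
    "RAG grounds answers in retrieved facts.": [0.9, 0.1, 0.0],
    "GPUs accelerate deep learning.":          [0.1, 0.9, 0.2],
}

def retrieve(query_vec, index, k=1):
    """Return the k passages whose embeddings best match the query vector."""
    ranked = sorted(index, key=lambda p: cosine(query_vec, index[p]), reverse=True)
    return ranked[:k]

query_vec = [0.8, 0.2, 0.1]          # stand-in for an embedded user query
context = retrieve(query_vec, index) # passages handed back to the LLM
print(context)
```

The retrieved passages are what the LLM folds into its final, citable answer; only the similarity search differs between this sketch and a production system.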
Keeping Sources Current
In the background, the embedding model continuously creates and updates machine-readable indices, sometimes called vector databases, for new and updated knowledge bases as they become available.
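That background refresh amounts to an "upsert" into the index: new or changed documents are re-embedded and written over the old entry. A minimal sketch, in which `fake_embed` (a letter-frequency counter) and the in-memory dict are illustrative stand-ins for a real embedding model and vector database:

```python
def fake_embed(text):
    """Illustrative stand-in for an embedding model: letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

index = {}

def upsert(index, doc_id, text):
    """Insert or refresh a document's embedding, keeping the index current."""
    index[doc_id] = {"text": text, "vector": fake_embed(text)}

upsert(index, "policy-1", "Refunds are issued within 30 days.")
upsert(index, "policy-1", "Refunds are issued within 14 days.")  # refresh in place
print(index["policy-1"]["text"])
```

Because the update is keyed by document ID, a changed source replaces its old vector rather than accumulating stale copies, which is what keeps retrieval results current.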
[Chart: a RAG process as described by LangChain]
Many developers find LangChain, an open-source library, can be particularly useful in chaining together LLMs, embedding models and knowledge bases. NVIDIA uses LangChain in its reference architecture for retrieval-augmented generation.
The LangChain community provides its own description of a RAG process.
Looking forward, the future of generative AI lies in creatively chaining all sorts of LLMs and knowledge bases together to create new kinds of assistants that deliver authoritative results users can verify.
Get hands-on experience using retrieval-augmented generation with an AI chatbot in this NVIDIA LaunchPad lab.
Explore generative AI sessions and experiences at NVIDIA GTC, the global conference on AI and accelerated computing, running March 18-21 in San Jose, Calif., and online.
Categories: Deep Learning | Explainer | Generative AI
Tags: Artificial Intelligence | Events | Inference | Machine Learning | New GPU Uses | TensorRT | Trustworthy AI
Awful quiet in v land nvda target just upped to 1,400.00 😱 We need some v news. Like nvda licenses v patents🎉🎉🎉
AI stocks are breaking out like I predicted last summer. I think this will be larger than the .com boom. Be careful investing in new Quantum startups. I’m sure there will be many popping up soon.
It sure would be nice to get some news. Seems to me it’s not going to make much difference who knows what now. I think it’s time to let the cat out of the bag. Unless maybe we have other creditors that we don’t want knowing what’s going on??? Wade was a sleeze ball. Who knows what other shady deals he’s made. Oh. Our BoD. Let’s all hope they get this mess cleaned up soon. In the meantime. My money is on AI
V is sleeping. Lookin at ionq. It may go lower. Watching it.
Still lost. 😂
U lost me on that one.
Maybe nvda will be one of the buyers of v? If the gov ever lets this run. ???? I’m tired of waiting. I should have bought nvda last year when I told u guys they were going to hook up 3 quantum computers by this summer. My 5 k would be 10 k now. It was only 500 bucks last year. 😔 Watch how it jumps when they announce all 3 hooked together!💵💰💵💰💵
Key words.
Off loads desktop virus….
VMware Horizon is not a browser itself, but it provides options for connecting to desktops and applications either through its dedicated client or via a web browser using HTML5.
Maybe. I just bought several shares of nvda this morning at 1,034.00 to hedge my bet. I’ll be buying several more as soon as my money clears. All aboard!
Everyone wants to go to heaven. But no one wants to die.
Remember. We’re a “private” co as it stands. If some biggie buys us out. No one may hear anything. Just sayin. And we actually don’t have any say in it? Being private? Or do we?
Yeah that would be nice. But I bet they don’t say a word about us. Something has to give sooner or later. My guess is later. Or maybe never?
Doc. No time limit on Texas Supreme Court to rule. :($&&$())$&)))&@“. I feel like it could be any day now. But who knows? We wait.
Motion estimation
In computer vision and image processing, motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. It is an ill-posed problem as the motion happens in three dimensions (3D) but the images are a projection of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image (global motion estimation) or specific parts, such as rectangular blocks, arbitrary shaped patches or even per pixel. The motion vectors may be represented by a translational model or many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.
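The block-based case in the definition above can be sketched with exhaustive block matching: slide a block from the current frame over a search window in the previous frame and keep the offset with the lowest sum of absolute differences (SAD). This is a hypothetical minimal example; real codecs use fast search patterns and sub-pixel refinement.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
                          for a, b in zip(row_a, row_b))

def block(frame, y, x, size):
    """Extract a size x size block with top-left corner (y, x)."""
    return [row[x:x + size] for row in frame[y:y + size]]

def motion_vector(prev, curr, y, x, size=2, search=2):
    """Find the (dy, dx) into `prev` minimizing SAD against a block of `curr`."""
    target = block(curr, y, x, size)
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            py, px = y + dy, x + dx
            if 0 <= py <= len(prev) - size and 0 <= px <= len(prev[0]) - size:
                cost = sad(block(prev, py, px, size), target)
                if cost < best_cost:
                    best_cost, best = cost, (dy, dx)
    return best

# A bright 2x2 patch moves one pixel right and one pixel down between frames,
# so the matching block in `prev` lies at offset (-1, -1) from the current one.
prev = [[0, 0, 0, 0],
        [0, 9, 9, 0],
        [0, 9, 9, 0],
        [0, 0, 0, 0]]
curr = [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 9, 9],
        [0, 0, 9, 9]]
print(motion_vector(prev, curr, 2, 2))  # -> (-1, -1)
```

The translational model here is the simplest case mentioned in the definition; richer models (rotation, zoom) replace the integer offset with more parameters but keep the same match-and-minimize structure.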
Key word? Arbitrary. The key word all the biggies were /are worried about. Yep Doc. I think we’re in there. And Len and his men got this. Relax. Visit Volcano National Park. Stop at the Black Sand Beach at Punaluu then the Green Sand Beach at South point. One of only 4 in the world.
Oh! And “ transformation “ is another one to think about and consider.
His share value is frozen at .02 cents. He owes us $ 21,000,000.00. He only has 80 mil shares x .02 = $ 1,600,000.00. Why would we let him off? U crazy! He needs prison time. Period. A lot of us have died waiting. Many. Too many. Throw the book at him and all the sleeze he attracted.
Monday through Thursday. Should be interesting. Wish I was there. But actually. Seattle sucks. Imo
If Apple actually licensed Ploinks 🎉🎉🎉💰💰💰💵💵💵💵. Looks like it is all coming together. This lawsuit against wolman was necessary for a few reasons. All good ones. Len and his men are nobody to mess with. I think we’ll be having lots of good news soon. Just a hunch
The new case against Wolman will be tried in Texas, not NY . Texas will be fair . Justice will prevail. The money we could get back could pay off the IRS and fund the audit. If they’re smart they will settle like Mills and Farias asap
Knowledge is power. Just ask my daughter. Lol Hell YALE! GO BULLDOGS!
The problem I see with SBV, or anyone for that matter, is as soon as your idea/invention is ready for mkt, it has already been trumped by several other breakthroughs. Exciting times for sure. But everything is moving at warp speed now. It will take a ceo on his toes to keep up. I think Len needs to hire Doc to give him a 30 minute morning briefing on what is happening in the Tec world. Without someone like Doc to keep u on your toes. You’re goona be just pissin into the wind. I know he has Luiz , Scott and others, but a designated daily Tec article educator could be very helpful in making major decisions in these warp speed times. Just sayin
Surprisingly, the exchange energy of holes is not only electrically controllable, but strongly anisotropic.
anisotropic (adjective, Physics):
1. (of an object or substance) having a physical property that has a different value when measured in different directions. A simple example is wood, which is stronger along the grain than across it.
2. (of a property or phenomenon) varying in magnitude according to the direction of measurement: "electron scattering is anisotropic."
Experiment opens door for millions of qubits on one chip
Date: May 6, 2024
Source: University of Basel
Summary: Researchers have achieved the first controllable interaction between two hole spin qubits in a conventional silicon transistor. The breakthrough opens up the possibility of integrating millions of these qubits on a single chip using mature manufacturing processes.
Researchers from the University of Basel and the NCCR SPIN have achieved the first controllable interaction between two hole spin qubits in a conventional silicon transistor. The breakthrough opens up the possibility of integrating millions of these qubits on a single chip using mature manufacturing processes.
The race to build a practical quantum computer is well underway.
Researchers around the world are working on a huge variety of qubit technologies.
So far, there is no consensus on what type of qubit is most suitable for maximizing the potential of quantum information science.
Qubits are the foundation of a quantum computer: they handle the processing, transfer and storage of data.
To work correctly, they have to both reliably store and rapidly process information.
The basis for rapid information processing is stable and fast interactions between a large number of qubits whose states can be reliably controlled from the outside.
For a quantum computer to be practical, millions of qubits must be accommodated on a single chip.
The most advanced quantum computers today have only a few hundred qubits, meaning they can only perform calculations that are already possible (and often more efficient) on conventional computers.
Electrons and holes
To solve the problem of arranging and linking thousands of qubits, researchers at the University of Basel and the NCCR SPIN rely on a type of qubit that uses the spin (intrinsic angular momentum) of an electron or a hole.
A hole is essentially a missing electron in a semiconductor.
Both holes and electrons possess spin, which can adopt one of two states: up or down, analogous to 0 and 1 in classical bits.
Compared to an electron spin, a hole spin has the advantage that it can be entirely electrically controlled without needing additional components like micromagnets on the chip.
As early as 2022, Basel physicists were able to show that the hole spins in an existing electronic device can be trapped and used as qubits.
These "FinFETs" (fin field-effect transistors) are built into modern smartphones and are produced in widespread industrial processes.
Now, a team led by Dr. Andreas Kuhlmann has succeeded for the first time in achieving a controllable interaction between two qubits within this setup.
Fast and precise controlled spin-flip
A quantum computer needs "quantum gates" to perform calculations.
These represent operations that manipulate the qubits and couple them to each other.
As the researchers report in the journal Nature Physics, they were able to couple two qubits and bring about a controlled flip of one of their spins, depending on the state of the other's spin -- known as a controlled spin-flip.
"Hole spins allow us to create two-qubit gates that are both fast and high-fidelity. This principle now also makes it possible to couple a larger number of qubit pairs," says Kuhlmann.
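A controlled spin-flip of this kind corresponds to the CNOT gate of quantum computing. As a numerical sketch, here is the standard 4x4 CNOT matrix applied to two-qubit basis states; mapping spin-up/down onto |0> and |1> is the usual textbook convention, not something taken from the paper itself.

```python
# Two-qubit basis ordering: |00>, |01>, |10>, |11> (control qubit written first).
# CNOT flips the target qubit only when the control qubit is in state 1.
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

def apply(gate, state):
    """Multiply a gate matrix into a state vector."""
    return [sum(gate[i][j] * state[j] for j in range(len(state)))
            for i in range(len(gate))]

state_10 = [0, 0, 1, 0]      # control = 1, target = 0
print(apply(CNOT, state_10)) # control is 1, so the target flips: |11>

state_00 = [1, 0, 0, 0]      # control = 0, target = 0
print(apply(CNOT, state_00)) # control is 0, so nothing happens: |00>
```

The "depending on the state of the other's spin" in the paragraph above is exactly the conditional structure of this matrix: the upper-left block leaves the target alone, the lower-right block swaps it.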
The coupling of two spin qubits is based on their exchange interaction, which occurs between two indistinguishable particles that interact with each other electrostatically.
Surprisingly, the exchange energy of holes is not only electrically controllable, but strongly anisotropic.
This is a consequence of spin-orbit coupling, which means that the spin state of a hole is influenced by its motion through space.
To describe this observation in a model, experimental and theoretical physicists at the University of Basel and the NCCR SPIN combined forces.
"The anisotropy makes two-qubit gates possible without the usual trade-off between speed and fidelity," Dr. Kuhlmann says in summary.
"Qubits based on hole spins not only leverage the tried-and-tested fabrication of silicon chips, they are also highly scalable and have proven to be fast and robust in experiments." The study underscores that this approach has a strong chance in the race to develop a large-scale quantum computer.
Oh yeah. I almost forgot about our promised Ploinks shares. I wonder if they will still spin that off or was it al b s ? Knowing Wade it was pure unadulterated b s
Half stock. Half cash if we sell to some biggie.
Gotta wonder if nefarious dropped a dime on wolman for a sweet deal? This would make a great movie.
Apple to unveil new improved Siri June 10.
Apple plans to bill the improved Siri as more private than rival AI services because it will process requests on iPhones rather than remotely in data centers.
Me … hmmm a mini server on your phone? Who’d a thunk it Luiz? lol
It looks to me like they are in a world of hurt. The way Pete has laid it all out from year to year makes it very easy to follow the money. Any jury after going over all this a few times will plainly see what went on here. A slam dunk for us imo
$500,000.00 to $750,000.00 with treble damages and around 25,000,000 shares of stock that could be returned? That’s not chump change folks.
I’m sure our BoD and possibly other employees all played a part in laying this all out so hats off to each and everyone of you. Shareholders owe all of you a debt of gratitude. Mahalo to all.
It looks like we have all the documents we need. Paying off the back rent Wade skipped out on got us all the records we needed. It just took hundreds of hours to make a deep dive through it all occurring over several years. Pete did a really good job of laying out years of this b s. What a tangled web they wove. But he sifted through years of records and laid it all out. Awesome job Pete! You are to be commended. Mahalo nui loa! You get the circle Island tour !
These guys all seem like 2 peas in a pod. But it looks like they will all get cooked together. Ya think Farias flipped on them to avoid prosecution? It looks to me like we’ll be getting a lot of money back. Maybe enough to pay up I R S ? This can only help us to get things in order much faster now. The prospect of more money coming in is always a good thing.
In summary, web crawlers do not use browsers but directly communicate with web servers to collect and index content. They play a crucial role in organizing the vast amount of information available online for efficient search engine results. 😊
Ok. Where’s our dough? I have to believe there are licensing agreements that have been signed. If not, why not?
Could everyone be crawling on our patent? Pun intended.
And now remember back when our spider web crawler patent was approved. It didn’t mean much back then. But now as you point out look how it fits in with AI ! It’s all coming together now. I think the big picture is about to be revealed.
After reading through this again. It sure looks to me like a whole lot of people may be going to jail? Just a hunch. Lol
Cool. Link works. Read and enjoy Doc. Unbelievable. Everything revealed. Even SiteFlash money etc etc I have to read it again. These guys are sooooo screwed imo. Pete our atty did a fabulous job in laying this tangled web out for a jury.
Just read pacer doc doc lol. These guys are in very serious trouble imo. I’ll try to post the link. My post from chat site.
Wow! Way to go Pete! Very well done. All their sins layed bare. These guys are in very serious trouble. Very serious. I’m talkin very possible serious jail time. And a jury trial to boot! Any jury imo will throw the book at them. I’m seeing possibly disbarred, fines and jail time ? Oh my! No wonder we couldn’t get off the ground with anything. But it looks like we will easily win this. Hang in there everyone. Pete and our BoD moved heaven and earth to lay this all out through hundreds of hours of research under less than favorable conditions. I highly commend them and all their Herculean efforts. I don’t think this will hinder us moving forward. I actually think it will help us! Watch.