This Week in the World of Artificial Intelligence
Here are a few things I learned this week about the fast-moving field of artificial intelligence.
Lawyers are waking up and entering uncharted legal waters. Ars Technica had a nice article on the imminent legal earthquake for AI.
“Stability AI has copied more than 12 million photographs from Getty Images’ collection, along with the associated captions and metadata, without permission from or compensation to Getty Images,” Getty wrote in its lawsuit. Legal experts tell me that these are uncharted legal waters.
The larger concern, Sag told me, is that these infringing outputs could “make the whole fair use defense unravel.” The core question in fair use analysis is whether a new product acts as a substitute for the product being copied or whether it “transforms” the old product into something new and distinctive. In the Google Books case, for example, the courts had no trouble finding that a book search engine was a new, transformative product that didn’t in any way compete with the books it was indexing. Google wasn’t making new books. Stable Diffusion is creating new images. And while Google could guarantee that its search engine would never display more than three lines of text from any page in a book, Stability AI can’t make a similar promise. On the contrary, we know that Stable Diffusion occasionally generates near-perfect copies of images from its training data.
ChatGPT lies. We’ve heard it or seen it in action, and there is a lot of focus on the quality of the results. The paper “Using GitHub Copilot to Solve Simple Programming Problems” evaluates the limitations of OpenAI’s Codex natural language model and of Copilot in an educational setting.
https://github.blog/2023-03-22-github-copilot-x-the-ai-powered-developer-experience/ shows how advances to aid developers are coming fast and furious. The key will be understanding how these tools work while protecting your company’s IP. It is one thing to have them work for you on open-source projects. It is another to run them on your proprietary code. Some clarity around this will be important.
I also heard about GPT4all (https://github.com/nomic-ai/gpt4all), a chatbot trained on a massive collection of clean assistant data, including code, stories, and dialogue. The GPT4All Technical Report still reads like gibberish to me.
Stephen Wolfram getting giddy about AI validates that this is something to pay attention to.
People have been wondering what Meta has been up to. Last week, they released SAM (no, not a bot): the Segment Anything Model, which aims to revolutionize image segmentation. It seems to be more of a building block for other systems than an end-user product.
I closed out the week reading Stanford’s AI Index report.