ChatGPT

nsaspook

Joined Aug 27, 2009
16,325
https://spectrum.ieee.org/prompt-engineering-is-dead
AI Prompt Engineering Is Dead
Long live AI prompt engineering
“It’s very easy to make a prototype,” Henley says. “It’s very hard to production-ize it.” Prompt engineering seems like a big piece of the puzzle when you’re building a prototype, Henley says, but many other considerations come into play when you’re making a commercial-grade product.

Challenges of making a commercial product include ensuring reliability—for example, failing gracefully when the model goes offline; adapting the model’s output to the appropriate format, since many use cases require outputs other than text; testing to make sure the AI assistant won’t do something harmful in even a small number of cases; and ensuring safety, privacy, and compliance. Testing and compliance are particularly difficult, Henley says, as traditional software-development testing strategies are maladapted for nondeterministic LLMs.

To fulfill these myriad tasks, many large companies are heralding a new job title: Large Language Model Operations, or LLMOps, which includes prompt engineering in its life cycle but also entails all the other tasks needed to deploy the product. Henley says LLMOps’ predecessors, machine learning operations (MLOps) engineers, are best positioned to take on these jobs.
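Two of the reliability points in the excerpt above (failing gracefully when the model goes offline, and forcing the output into a format the rest of the product can use) can be sketched in a few lines. This is only an illustration under assumptions: `ask_with_fallback`, the `FALLBACK` value, and the three stand-in "models" are hypothetical names made up for this sketch, not anything from the article or from a real LLM client library.

```python
import json
from typing import Callable

# Hypothetical fallback returned when no usable reply is available.
FALLBACK = {"status": "unavailable", "answer": None}


def ask_with_fallback(call_model: Callable[[str], str], prompt: str, retries: int = 2) -> dict:
    """Call a model; accept only JSON objects with an 'answer' key, else fall back."""
    for _ in range(retries + 1):
        try:
            raw = call_model(prompt)
        except (ConnectionError, TimeoutError):
            continue  # model offline or too slow: try again
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # free text instead of the requested format
        if isinstance(parsed, dict) and "answer" in parsed:
            return {"status": "ok", "answer": parsed["answer"]}
    return dict(FALLBACK)  # copy, so callers cannot mutate the shared default


# Stand-in "models" for illustration only; a real client call would go here.
def offline_model(prompt: str) -> str:
    raise ConnectionError("model endpoint unreachable")


def chatty_model(prompt: str) -> str:
    return "Sure! The answer is 42."  # plain text, not the JSON we asked for


def well_behaved_model(prompt: str) -> str:
    return json.dumps({"answer": "42"})


if __name__ == "__main__":
    for model in (offline_model, chatty_model, well_behaved_model):
        print(model.__name__, ask_with_fallback(model, "What is 6 * 7?"))
```

Note what this does not cover: it checks only the shape of the reply, not whether the answer is true, which is the harder testing problem Henley is pointing at.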
 

nsaspook

Joined Aug 27, 2009
16,325
https://www.semafor.com/article/03/...repeats-the-same-ai-blunders-as-google-gemini
Adobe Firefly repeats the same AI blunders as Google Gemini

In this newsletter, I’ve shared my somewhat contrarian view on this. I think it shows a technical shortcoming in the way large language models work. And that’s not unique to Google; it’s inherent in the architecture that powers generative AI.

Earlier this week, my Semafor colleague Alan Haburchak pointed out that he was seeing similar results in Adobe Firefly, the image generation service that launched about a year ago.

Adobe lacks Google’s high profile and hasn’t become a political target, but he wasn’t the first person to point out the problem. I found Adobe customers complaining about this issue last May. “I’m creating a comic and one of the characters happens to be an elderly white man,” one customer wrote. “But Firefly insists on giving me a ‘diverse’ mix of images, one black, one hispanic, one asian, one white.”
The Adobe results show how this issue is not exclusive to one company or one type of model. And Adobe has, more than most big tech companies, tried to do everything by the book. It trained its algorithm on stock images, openly licensed content, and public domain content so that its customers could use its tool without worries about copyright infringement.
 

nsaspook

Joined Aug 27, 2009
16,325
https://themarkup.org/news/2024/03/29/nycs-ai-chatbot-tells-businesses-to-break-the-law
NYC’s AI Chatbot Tells Businesses to Break the Law
In October, New York City announced a plan to harness the power of artificial intelligence to improve the business of government. The announcement included a surprising centerpiece: an AI-powered chatbot that would provide New Yorkers with information on starting and operating a business in the city.

The problem, however, is that the city’s chatbot is telling businesses to break the law.

Five months after launch, it’s clear that while the bot appears authoritative, the information it provides on housing policy, worker rights, and rules for entrepreneurs is often incomplete and in worst-case scenarios “dangerously inaccurate,” as one local housing policy expert told The Markup.
...
There’s little reason for visitors to the chatbot page to distrust the service. Users who visit today get informed the bot “uses information published by the NYC Department of Small Business Services” and is “trained to provide you official NYC Business information.” One small note on the page says that it “may occasionally produce incorrect, harmful or biased content,” but there’s no way for an average user to know whether what they’re reading is false. A sentence also suggests users verify answers with links provided by the chatbot, although in practice it often provides answers without any links. A pop-up notice encourages visitors to report any inaccuracies through a feedback form, which also asks them to rate their experience from one to five stars.
 

WBahn

Joined Mar 31, 2012
32,844
It never ceases to amaze me how naive (or many other terms that could be inserted here) large organizations (whether it be governments or companies or other) can be. You would think that, with as many people as had to have a finger in this pie, at least one of them would have an awareness of how stupid the very notion is of trying to do something like this. Imagine that I had a tax preparation company and I gave out blatantly wrong information on basic tax questions to my customers. Would it matter that in the fine print of the contract the customer signed I said that any information given to them could be completely wrong and that it was their responsibility to verify everything I told them? Of course not -- nor should it. I am representing myself as someone that is a professional in that field and people that come to me have a reasonable expectation that the information I give them is accurate and correct unless I disclose specifically that a certain piece of information is not. Would the fact that those answers were generated by some chatbot absolve me of responsibility in the eyes of NYC regulators and prosecutors? Of course not -- nor should it. Yet they will assert that, because they have a disclaimer on the page, they can't be held accountable.
 

nsaspook

Joined Aug 27, 2009
16,325
I'm sure you can use that disclaimer for a free 'get out of jail' card. o_O
 

nsaspook

Joined Aug 27, 2009
16,325
https://www.theregister.com/2024/03/28/ai_bots_hallucinate_software_packages/
AI hallucinates software packages and devs download them – even if potentially poisoned with malware
Several big businesses have published source code that incorporates a software package previously hallucinated by generative AI.
...
According to Bar Lanyado, security researcher at Lasso Security, one of the businesses fooled by AI into incorporating the package is Alibaba, which at the time of writing still includes a pip command to download the Python package huggingface-cli in its GraphTranslator installation instructions.

There is a legit huggingface-cli, installed using pip install -U "huggingface_hub[cli]".


But the huggingface-cli distributed via the Python Package Index (PyPI) and required by Alibaba's GraphTranslator – installed using pip install huggingface-cli – is fake, imagined by AI and turned real by Lanyado as an experiment.

He created huggingface-cli in December after seeing it repeatedly hallucinated by generative AI; by February this year, Alibaba was referring to it in GraphTranslator's README instructions rather than the real Hugging Face CLI tool.
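One cheap precaution against this kind of thing is to compare every package name in an install list against a list of names somebody has actually vetted, before anything gets passed to pip. A minimal sketch follows, assuming a hand-maintained allowlist; `normalize`, `requirement_name`, and `unvetted` are hypothetical helpers written for this post, and the parsing is deliberately crude (it handles plain names, extras, and version specifiers, not URLs or pip options). The name normalization follows the rule PyPI uses, where runs of `-`, `_`, and `.` are treated as equivalent.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name the way PyPI does (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line: str):
    """Extract the normalized project name from one requirement line, or None."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return normalize(match.group(0)) if match else None


def unvetted(requirements, allowlist):
    """Return the normalized names that are not on the vetted allowlist."""
    allowed = {normalize(name) for name in allowlist}
    flagged = []
    for line in requirements:
        name = requirement_name(line)
        if name is not None and name not in allowed:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Hypothetical allowlist and install list, using the names from the article.
    vetted = {"huggingface_hub", "numpy"}
    wanted = ["huggingface_hub[cli]", "huggingface-cli", "numpy>=1.26"]
    print(unvetted(wanted, vetted))
```

With the names from the article, the real `huggingface_hub[cli]` passes and the hallucinated `huggingface-cli` gets flagged. An allowlist only helps if a human actually checks names before adding them, which is the step that got skipped here.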
 

WBahn

Joined Mar 31, 2012
32,844
As with so many security exploits (across the board, not just computer-related), this one comes down to basic social engineering -- identify vulnerabilities due to human laziness and then leverage them, knowing that enough of those same lazy humans will continue to be too lazy to exercise even the most basic precautions, to make it worthwhile.
 

nsaspook

Joined Aug 27, 2009
16,325
https://techcrunch.com/2024/04/02/anthropic-researchers-wear-down-ai-ethics-with-repeated-questions/
Anthropic researchers wear down AI ethics with repeated questions
How do you get an AI to answer a question it’s not supposed to? There are many such “jailbreak” techniques, and Anthropic researchers just found a new one, in which a large language model (LLM) can be convinced to tell you how to build a bomb if you prime it with a few dozen less-harmful questions first.

They call the approach “many-shot jailbreaking” and have both written a paper about it and also informed their peers in the AI community about it so it can be mitigated.

The vulnerability is a new one, resulting from the increased “context window” of the latest generation of LLMs. This is the amount of data they can hold in what you might call short-term memory, once only a few sentences but now thousands of words and even entire books.


What Anthropic’s researchers found was that these models with large context windows tend to perform better on many tasks if there are lots of examples of that task within the prompt. So if there are lots of trivia questions in the prompt (or priming document, like a big list of trivia that the model has in context), the answers actually get better over time. So a fact that it might have gotten wrong if it was the first question, it may get right if it’s the hundredth question.

But in an unexpected extension of this “in-context learning,” as it’s called, the models also get “better” at replying to inappropriate questions. So if you ask it to build a bomb right away, it will refuse. But if you ask it to answer 99 other questions of lesser harmfulness and then ask it to build a bomb … it’s a lot more likely to comply.
 

nsaspook

Joined Aug 27, 2009
16,325
https://cacm.acm.org/opinion/generative-ai-and-cs-education/
Generative AI and CS Education
Increased knowledge sharing is helping CS educators and researchers accelerate change in computing education.

I was in grade school in the pre-calculator era. When calculators/computers became cheap and usable we still had the "number sense" to use them as tools (as working agents of our knowledge) instead of a crutch. I'm watching a daughter take college level programming courses in C++ and now C (using K&R as the course book :D ) for her computer architecture classes while taking calculus 3 and physics. So far Generative AI has only been lightly touched on during her learning process, so the students can develop the 'code sense' of a traditional CS education. Integrating AI assistants into education will IMO be totally different from the use of calculators because, unless there is a calculation error, calculators don't hallucinate answers. The "AI" gives you an answer (without understanding) and tries hard to convince you it's correct (at face value). You can only tell whether it's a good answer or not if you're capable of writing the good answer yourself.

How to teach the 'code sense' needed to spot these hallucinated, sometimes detailed and at times very complicated code responses from autocomplete chatbots will be an interesting process. These tools deliver code, but writing code has very little to do with "computer science".

IMO current LLMs are not early prototypes; they are pretty much at the limits of what's possible with that technology. The reliability and trust problems we see today likely can't and won't be fixed long-term with LLM-based programming.
 

nsaspook

Joined Aug 27, 2009
16,325
https://www.cnn.com/2024/04/06/tech/teachers-grading-ai/index.html
Teachers are using AI to grade essays. But some experts are raising ethical concerns

When Diane Gayeski, a professor of strategic communications at Ithaca College, receives an essay from one of her students, she runs part of it through ChatGPT, asking the AI tool to critique and suggest how to improve the work.

“The best way to look at AI for grading is as a teaching assistant or research assistant who might do a first pass … and it does a pretty good job at that,” she told CNN.

She shows her students the feedback from ChatGPT and how the tool rewrote their essay. “I’ll share what I think about their intro, too, and we’ll talk about it,” she said.

Gayeski requires her class of 15 students to do the same: run their draft through ChatGPT to see where they can make improvements.
But while some schools have formed policies on how students can or can’t use AI for schoolwork, many do not have guidelines for teachers. The practice of using AI for writing feedback or grading assignments also raises ethical considerations. And parents and students who are already spending hundreds of thousands of dollars on tuition may wonder if an endless feedback loop of AI-generated and AI-graded content in college is worth the time and money.
...
She also sees uploading a student’s work to ChatGPT as a “huge ethical consideration” and potentially a breach of their intellectual property. AI tools like ChatGPT use such entries to train their algorithms on everything from patterns of speech to how to make sentences to facts and figures.

Ethics professor Leidner agreed, saying this should particularly be avoided for doctoral dissertations and master’s theses because the student might hope to publish the work.
 

nsaspook

Joined Aug 27, 2009
16,325
https://www.bloomberg.com/news/newsletters/2024-04-12/ai-products-still-need-their-human-helpers
AI Products Still Rely on Humans to Fill the Performance Gaps
People build AI to mimic human intelligence and capabilities. But when the AI can’t quite deliver on the promise, we end up with humans pretending to be chatbots pretending to be humans.

It’s the latest iteration of a trick that stretches back at least as far as 1770, when the original Mechanical Turk machine appeared to play chess automatically — but actually concealed a human chessmaster inside its apparatus.
...
As long as there’s the incentive to overhype AI’s abilities, there will be gaps between what AI promises and what it can reliably do. To fill that gap, you can always hire a person.
 

nsaspook

Joined Aug 27, 2009
16,325
https://sherwood.news/tech/meta-wont-tell-you-what-went-into-training-its-new-ai-model-llama-3/
Meta’s not telling where it got its AI training data
It did mention that it includes AI-generated data, or synthetic data: “we used Llama 2 to generate the training data for the text-quality classifiers that are powering Llama 3.” There are plenty of known issues with synthetic or AI-created data, foremost of which is that it can exacerbate existing issues with AI, because it’s liable to spit out a more concentrated version of any garbage it is ingesting.
 