AI That Learns by Watching You?
PLUS: Updates from Perplexity, Replit, ElevenLabs, and Google, must-have tools for developers, and 2 papers on the real impact of prompting: is prompting dead or thriving?
In this edition of Learn Prompting Weekly:
1) AI That Learns by Watching You
We're heading into the era of Agentic AI, which is finally starting to take shape instead of remaining an abstract idea. Anthropic's Computer Use led the charge with an AI that can interact with your computer and gave us a real look at what the future of AI agents could be.
Demonstration of Context's Autopilot in action
So, how do we get these agents to work like we do? There's a pretty clear pattern here: companies are starting with tools that watch how you work, learn from it, and then mimic your behavior. These tools pick up on the way you handle basic tasks and learn to replicate them. That data is probably being used to train the first generation of AI agents, setting the stage for versions that are smarter, more accurate, and capable of tackling more complicated workflows.
Here are the latest examples.
Context's Autopilot: An AI That Works the Way You Do
Context's Autopilot capabilities
Context has launched Autopilot, an AI tool that makes your work easier by learning how you work. Here's what that looks like:
Observes and Learns: Autopilot watches how you handle tasks like writing emails, organizing files, or solving problems.
Adapts to Your Style: Over time, it understands your preferences and adjusts to how you like to work.
Uses Familiar Tools: It works with tools you already use, like Google Drive, Slack, or SharePoint, so you don't have to learn anything new.
Handles Tasks for You: Autopilot can create documents, reply to emails, or analyze data, saving you time and effort.
In short, Autopilot learns from you to become your personalized productivity assistant.
ChatGPT's Mac Desktop App: AI That "Sees" What You See
How the ChatGPT Mac desktop app interacts with other apps on your computer
The macOS version of ChatGPT is evolving into a more hands-on helper. It's still in beta, but it already has some exciting features:
Desktop Awareness: ChatGPT can "see" your desktop and interact directly with open apps.
Works With Specific Tools: It integrates with coding tools like VS Code and Xcode, terminal apps like Terminal and iTerm, and even basic text editors.
Observes and Assists: For now, it provides suggestions and answers based on what you're working on.
OpenAI is reportedly planning to launch an autonomous agent in January that could perform actions on your computer to handle tasks for you.
Codeium's Windsurf Editor: Coding Suggestions, Before You Ask
Supercomplete in action. This feature inside Windsurf Editor predicts the next user intent, rather than the next text at the cursor position
Codeium's Windsurf Editor is an AI-powered coding tool designed to make programming more efficient by learning from how you work. Here's what it offers:
Learns Your Workflow: It analyzes your past actions to predict what you'll do next, making suggestions or automating repetitive tasks.
Understands Complex Codebases: Windsurf Editor can study relationships between files, group relevant components, and help you navigate large projects.
Works Without Micromanagement: It doesn't need constant input; it proactively performs tasks and supports your workflow, acting like an intelligent assistant.
With these capabilities, Windsurf Editor helps developers focus on what matters while handling the repetitive or complex parts of coding.
Other GenAI Market Updates
Perplexity Shopping: Search for products, compare reviews, and find items online with "Snap to Shop." "Buy with Pro" offers free shipping and handles checkout. Shopify integration is in the works.
Replit Agent: Turn your ideas into apps. Share a concept, and it writes and deploys the code. Examples include health dashboards and campus parking maps.
ElevenLabs Projects: Advanced tools for text-to-audio creation with multi-voice customization, bulk adjustments, and faster playback. Perfect for audiobooks, scripts, and complex narrations.
Google Learn About: A conversational learning tool offering quick articles, sources, and image-based prompts to explore new topics.
Google Workspace AI Generator: Create visuals directly in Google Docs using Gemini-powered AI to enhance your documents.
2) AI for Coders
We've just released a step-by-step guide on how to effectively use generative AI chatbots for coding. It's packed with actionable tips to help you automate routine tasks; there's a small taste of one of them right after the topic list below.
What's inside:
GenAI Tools for Code
Automated Code Generation
AI-Assisted Code Completion
Debugging with AI
Code Optimization and Refactoring
Writing Automated Tests
Documentation of Your Code with AI
Understanding Code
Learning New Programming Languages or Frameworks
Advanced: AI Agents for Coding
Ethical Considerations
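To give a small, concrete taste of the "Writing Automated Tests" tip, here's a minimal sketch that asks a chat model to draft pytest tests for a function. It assumes the `openai` Python package, an `OPENAI_API_KEY` environment variable, and a placeholder model name; it isn't the guide's exact example.

```python
# Minimal sketch: ask a chat model to draft pytest tests for a small function.
# Assumptions: the `openai` package is installed, OPENAI_API_KEY is set, and the
# model name below is a placeholder for whichever model you actually use.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FUNCTION_UNDER_TEST = '''
def slugify(title: str) -> str:
    return "-".join(title.lower().split())
'''

prompt = (
    "Write pytest unit tests for the following function. "
    "Cover normal input, empty input, and extra whitespace:\n"
    + FUNCTION_UNDER_TEST
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a careful Python testing assistant."},
        {"role": "user", "content": prompt},
    ],
)

print(response.choices[0].message.content)  # review the tests before running them
```

Treat the generated tests as a starting point to review and refine, not a finished suite.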
Read the full article and start putting these tips into action today!
New Tools for Developers
Gemini AI models are now accessible via the OpenAI library and REST API, so existing OpenAI-compatible code can switch over with little more than a new base URL and API key (see the sketch after this list).
CopilotKit & LangGraph introduced CoAgents, which provides everything needed to integrate LangGraph agents into React apps.
Microsoft open-sourced TinyTroupe, a Python library that simulates personas with specific traits and goals.
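For the Gemini update above, here's a minimal sketch of what "accessible via the OpenAI library" looks like in practice, assuming the `openai` Python package, a `GEMINI_API_KEY` environment variable, and Google's OpenAI-compatibility base URL and model names as documented at launch; check the current docs before relying on it.

```python
# Minimal sketch: calling a Gemini model through the OpenAI Python library.
# Assumptions: the `openai` package is installed, GEMINI_API_KEY holds a valid key,
# and the base URL / model name below match Google's OpenAI-compatibility endpoint.
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GEMINI_API_KEY"],
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

response = client.chat.completions.create(
    model="gemini-1.5-flash",  # assumption: any Gemini model exposed by the endpoint
    messages=[{"role": "user", "content": "Explain what an AI agent is in one sentence."}],
)

print(response.choices[0].message.content)
```

The point of the compatibility layer is that existing OpenAI-style code keeps working; only the key and base URL change.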
3) The Latest In Prompting: Is Prompt Engineering Dead?
Is prompt engineering dead? Two new studies researched the impact of prompt formatting and prompting techniques on the performance of large language models (LLMs).
Here's what they found:
Prompt Structure Matters: Changing the format of a prompt (e.g., JSON, Markdown) can affect performance by up to 40%.
Model Preferences Are Real: Experimenting with formats that match a specific model's strengths (e.g., GPT-3.5 prefers JSON, GPT-4 favors Markdown) can lead to better results.
Small Changes, Big Impact: Minor tweaks in phrasing or formatting can produce drastically different answers, especially for complex or nuanced tasks.
Use Examples for Better Results: Adding examples of what you want (known as few-shot prompting) can make the AI more reliable, particularly in difficult scenarios.
Larger Models Handle Prompts Better: Models like Llama3-70B are more adaptable to different prompt styles and less likely to produce inconsistent or incorrect responses.
Prompt sensitivity of different models. Higher sensitivity means that subtle variations in how the question was phrased have a significant impact on the quality of the response.
The second paper introduced ProSA, a framework built around a new metric, PromptSensiScore, designed to measure how much an LLM's performance is affected by variations in the wording, structure, or format of input prompts.
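The paper's metric is more involved than anything we can show here, but the underlying idea can be approximated with a naive check: ask the same question in several phrasings and see how much the answers disagree. The sketch below is an illustration only; the client, model name, paraphrases, and crude "count distinct answers" measure are assumptions, not PromptSensiScore.

```python
# Naive sketch of prompt sensitivity: ask the same question in several phrasings
# and count how many distinct answers come back. This is NOT the ProSA metric;
# the model name and paraphrases are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

paraphrases = [
    "What is the capital of Australia? Answer with the city name only.",
    "Name the Australian capital city. Respond with just the name.",
    "Which city serves as Australia's capital? One-word answer, please.",
]

answers = []
for prompt in paraphrases:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # reduce sampling noise so differences come from phrasing
    )
    answers.append(response.choices[0].message.content.strip().lower())

distinct = len(set(answers))
print(f"{distinct} distinct answer(s) across {len(paraphrases)} phrasings: {answers}")
```

A highly sensitive model gives noticeably different answers across such phrasings on harder tasks; a robust one does not.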
Tasks where prompting made the biggest difference:
In tasks requiring creativity or complex reasoning, prompt design had the greatest impact:
Creative Tasks: Writing scripts, stories, or creating visualizations, where open-ended prompts often led to highly variable outputs.
Data Visualization: Generating descriptive or graphical summaries of data, where unclear instructions could produce irrelevant or incomplete visualizations.
Algorithmic and Computational Problem Solving: Complex programming tasks requiring detailed reasoning or multi-step logic, where prompt clarity was crucial for success.
Tasks where AI models were most consistent, regardless of the prompt phrasing:
For straightforward, knowledge-based tasks, AI models performed consistently, regardless of how prompts were phrased:
IT Problem Solving
Business Solutions
Multilingual Programming and Debugging
What this means for you
Prompt engineering isn't dead; it's evolving. To get the most out of LLMs:
Experiment with formats to match the modelâs preferences.
Use few-shot examples to guide the model and improve reliability (there's a minimal sketch after this list).
Opt for larger models for complex or creative tasks, as they're better at handling prompt variations.
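To make the first two recommendations concrete, here's a minimal sketch that runs the same task once zero-shot and once with a couple of few-shot examples so you can compare the outputs. The client, model name, and examples are illustrative assumptions, not the papers' experimental setup.

```python
# Minimal sketch: compare a zero-shot prompt with a few-shot version of the same task.
# Assumptions: the `openai` package is installed, OPENAI_API_KEY is set, and the
# model name is a placeholder; this does not reproduce the papers' experiments.
from openai import OpenAI

client = OpenAI()
TASK = "Classify the sentiment as positive or negative: 'The battery died after two days.'"

zero_shot = [{"role": "user", "content": TASK}]

few_shot = [
    # A couple of worked examples first, then the real task.
    {"role": "user", "content": "Classify the sentiment as positive or negative: 'Absolutely love it.'"},
    {"role": "assistant", "content": "positive"},
    {"role": "user", "content": "Classify the sentiment as positive or negative: 'Broke on arrival.'"},
    {"role": "assistant", "content": "negative"},
    {"role": "user", "content": TASK},
]

for name, messages in [("zero-shot", zero_shot), ("few-shot", few_shot)]:
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(f"{name}: {response.choices[0].message.content}")
```

The same pattern works for the formatting finding: wrap the task in JSON or Markdown and compare how the answers change for your model of choice.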
Also, read our thoughts on that topic:
4) Resources from Learn Prompting
Dario Amodei recently highlighted an important reality: as AI models become more powerful, their potential for harm grows alongside their benefits. Just as English is becoming the "hottest new coding language," it's also becoming the tool people use to exploit AI.
Here's the surprising part: many of the winners of HackAPrompt 1.0 had no traditional tech background!
Developers struggle to predict vulnerabilities and enforce safeguards. The solution is AI Red Teams: experts who identify and fix weaknesses before they can be exploited.
Here are the two resources you can use to get started in Red Teaming:
1. HackAPrompt 2.0 with up to $500,000 in prizes
We're thrilled to announce HackAPrompt 2.0 with up to $500,000 in prizes! This competition brings together people from all backgrounds to tackle one of the most important challenges of our time: making AI safer for everyone.
AI systems are becoming integrated into every part of our daily lives, but they're not infallible. Vulnerabilities in these systems can lead to unintended harms, such as bypassing safety guardrails or eliciting dangerous outputs. HackAPrompt 2.0 invites participants to red-team these cutting-edge models, test their limits, and expose weaknesses across five unique tracks:
Classic Jailbreaking: Breaking AI models to elicit unintended responses.
Agentic Security: Testing AI systems embedded in decision-making processes.
Attacks of the Future: Exploring hypothetical threats we haven't seen yet.
Two Secret Tracks: To be revealed closer to the competition!
We'll announce more details in the following weeks. Join the waitlist to be notified when we launch!
2. Live Course on Red Teaming
We're teaching a course on AI Red Teaming, led by the CEO of Learn Prompting, Sander Schulhoff, who created HackAPrompt!
CEO of Learn Prompting, the 1st Prompt Engineering guide on the internet
Award-winning Deep Reinforcement Learning researcher who has co-authored papers with OpenAI, ScaleAI, HuggingFace, Microsoft, The Federal Reserve, and Google.
His post-competition paper on HackAPrompt was awarded Best Theme Paper at EMNLP 2023 out of 20,000 submitted papers. It was also cited by OpenAI's Instruction Hierarchy paper and used to make their latest models up to 46% more resistant to prompt injections.
Alongside Sander Schulhoff, notable guest speakers include:
Pliny the Prompter: The most renowned AI Jailbreaker, who has successfully jailbroken every AI model released to date, including OpenAI's o1, which hasn't even been made available to the public!
Akshat Parikh: Former AI security researcher at a startup backed by OpenAI and DeepMind researchers, Top 21 in JP Morgan's Bug Bounty Hall of Fame, and Top 250 in Google's Bug Bounty Hall of Fame.
Joseph Thacker: Principal AI Engineer & security researcher specializing in application security and AI, with over 1,000 vulnerabilities submitted across HackerOne and Bugcrowd.
More announced soon!
You'll learn alongside other students in our cohort including:
CTOs, CISOs, Chief Security Architects, Principal Security Engineers, Technical Leads, Site Reliability Engineers, PhD Researchers
Core contributors to the OWASP® Foundation Top 10 LLM Risks
Members of the National Institute of Standards and Technology (NIST) AI Safety Consortium
Government leaders shaping AI security policy
Cybersecurity professionals transitioning into AI Red Teaming careers
The course is $900 now but will increase to $1,200 on December 1st, and it starts in just 15 days, on December 5th.
If you have a Learning & Development budget to use before the year ends, this course is a fantastic way to invest in your skills, and it's reimbursable by many companies.
See you in the cohort!
Thanks for reading! We'd love your feedback to make this newsletter even better. Our goal is to share content that's valuable and relevant to you. Help us fine-tune future editions by sharing your thoughts on this week's email; it'll only take a moment!
How was this week's email?