LangSmith CLI & Skills

We’re releasing a CLI along with our first set of skills to give AI coding agents expertise in the LangSmith ecosystem. The skills cover adding tracing to agents, understanding their execution, building test sets, and evaluating performance. On our eval set, they raise Claude Code’s pass rate on these tasks from 17% to 92%.

The LangSmith CLI

At the core is our new LangSmith CLI, which is designed to be agent-native: it gives coding agents (and developers) the building blocks needed to do anything within LangSmith, including fetching traces, curating datasets, and running experiments. Combined with the guidance in skills, coding agents can navigate LangSmith fluently and entirely through the terminal. We believe this is critical to the future of agent development, as we expect agent improvement loops to increasingly be driven by other agents that are terminal-first.

You can install the CLI with the following installation script:

curl -sSL https://raw.githubusercontent.com/langchain-ai/langsmith-cli/main/scripts/install.sh | sh

What are Skills?

Skills are curated instructions, scripts, and resources that improve coding agent performance in specialized domains. Importantly, skills are dynamically loaded through progressive disclosure: the agent only retrieves a skill when it's relevant to the task at hand. This matters because, historically, giving an agent too many tools at once caused its performance to degrade.

Skills are portable and shareable — they consist of markdown files and scripts that can be retrieved on demand. We’re sharing a set of LangSmith skills that can be ported to any coding agent that supports skill functionality.
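To make progressive disclosure concrete, here is a toy sketch in Python. It is not how any particular coding agent implements skills: the `Skill` and `SkillRegistry` names, the keyword matching, and the placeholder skill bodies are all hypothetical, and a real agent would typically let the model itself judge relevance. The point is only that a short index is always visible while the full instructions are fetched on demand.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Skill:
    name: str
    description: str               # short summary, always visible to the agent
    keywords: frozenset            # hypothetical relevance hints
    load_body: Callable[[], str]   # full instructions, fetched only on demand


class SkillRegistry:
    def __init__(self, skills: list) -> None:
        self._skills = {s.name: s for s in skills}
        self.loaded: list = []     # records which skill bodies were fetched

    def index(self) -> str:
        """Cheap listing that would sit in the agent's context up front."""
        return "\n".join(
            f"{s.name}: {s.description}" for s in self._skills.values()
        )

    def select(self, task: str) -> list:
        """Toy relevance check: keyword overlap with the task text."""
        words = set(task.lower().split())
        return [s.name for s in self._skills.values() if words & s.keywords]

    def load(self, name: str) -> str:
        """Fetch the full skill body only when it is actually needed."""
        self.loaded.append(name)
        return self._skills[name].load_body()


# Descriptions follow the three skills named in this post; the bodies are
# placeholders, not the real skill contents.
registry = SkillRegistry([
    Skill("trace", "add tracing to existing code, and query traces",
          frozenset({"trace", "traces", "tracing"}),
          lambda: "<full trace skill instructions>"),
    Skill("dataset", "build up datasets of examples",
          frozenset({"dataset", "datasets", "examples"}),
          lambda: "<full dataset skill instructions>"),
    Skill("evaluator", "evaluate agents over those datasets",
          frozenset({"evaluate", "evaluator", "evaluation"}),
          lambda: "<full evaluator skill instructions>"),
])

relevant = registry.select("add tracing to my agent")
bodies = [registry.load(name) for name in relevant]
```

In this sketch only the `trace` body is loaded for a tracing task; the other two skills cost the agent one line each in the index and nothing more.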

LangSmith Skills

Within the langsmith-skills repo, we maintain a set of three skills:

  • trace: add tracing to existing code, and query traces
  • dataset: build up datasets of examples
  • evaluator: evaluate agents over those datasets

These three skills cover the core areas of AI engineering with LangSmith. We will add to this set over time.

Skill Impacts

Using skills, we saw significant improvements in Claude Code’s performance on basic LangSmith tasks.

| Test                       | Model      | Pass Rate |
|----------------------------|------------|-----------|
| Claude Code without Skills | Sonnet 4.6 | 17%       |
| Claude Code with Skills    | Sonnet 4.6 | 92%       |

Pass rate was calculated using LangSmith evaluations. We plan to open source the testing benchmark we used.
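For a sense of scale, the two pass rates can be compared with a few lines of arithmetic. The `improvement` helper below is purely illustrative and is not part of the benchmark; it only restates the 17% and 92% figures in other terms.

```python
def improvement(before_pct: float, after_pct: float) -> dict:
    """Summarize the change between two pass rates given in percent."""
    return {
        "point_gain": after_pct - before_pct,                   # percentage points
        "relative_factor": round(after_pct / before_pct, 2),    # times higher
        "failure_before": 100 - before_pct,                     # percent failing
        "failure_after": 100 - after_pct,
    }


summary = improvement(17, 92)
# point_gain is 75 percentage points; relative_factor is 5.41;
# the failure rate falls from 83% to 8%.
```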

These skills enable coding agents to create a virtuous cycle in agent development. Your coding agent can use LangChain and LangSmith skills to:

  1. Add tracing logic to your agent
  2. Generate traces with the agent and use them to effectively debug behavior
  3. Use generated traces to create a systematic testing dataset
  4. Create evaluators to run on the dataset and validate agent correctness
  5. Iterate further on the agent architecture based on evaluations and human feedback
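The five steps above can be sketched in plain Python. This is a minimal in-memory stand-in, not the LangSmith SDK or CLI: `Tracer`, `toy_agent`, `exact_match`, and `pass_rate` are hypothetical names, and in practice tracing, dataset storage, and evaluation would be handled by LangSmith rather than by local lists.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    inputs: str
    output: str


@dataclass
class Tracer:
    """Stand-in for a tracing backend: collects runs in memory."""
    traces: list = field(default_factory=list)

    def wrap(self, fn: Callable[[str], str]) -> Callable[[str], str]:
        def traced(inputs: str) -> str:
            output = fn(inputs)
            self.traces.append(Trace(inputs, output))  # record every call
            return output
        return traced


def toy_agent(question: str) -> str:
    """Hypothetical agent under test: upper-cases its input."""
    return question.upper()


def exact_match(output: str, expected: str) -> bool:
    """A minimal evaluator: pass only on an exact string match."""
    return output == expected


def pass_rate(candidate: Callable[[str], str], dataset: list) -> float:
    """Fraction of dataset examples the candidate agent gets right."""
    passed = sum(exact_match(candidate(x), y) for x, y in dataset)
    return passed / len(dataset)


# Step 1: add tracing to the agent.
tracer = Tracer()
agent = tracer.wrap(toy_agent)

# Step 2: generate traces by running the agent.
for question in ["hello", "trace me"]:
    agent(question)

# Step 3: turn (reviewed) traces into a dataset of input/expected pairs.
dataset = [(t.inputs, t.output) for t in tracer.traces]

# Step 4: run the evaluator over the dataset.
baseline_score = pass_rate(toy_agent, dataset)

# Step 5: iterate; a changed agent is re-scored against the same dataset.
regressed_score = pass_rate(lambda q: q.lower(), dataset)
```

Because the dataset is built from traces of known-good behavior, a later change that breaks that behavior shows up as a lower pass rate on the next evaluation run.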

This loop is a powerful tool to accelerate agent development. To see it in action, watch our demo of the skills.

Installation

You can install these skills using npx skills:

Local (current project):

npx skills add langchain-ai/langsmith-skills --skill '*' --yes

Global (all projects):

npx skills add langchain-ai/langsmith-skills --skill '*' --yes --global

To link skills to a specific agent (e.g. Claude Code):

npx skills add langchain-ai/langsmith-skills --agent claude-code --skill '*' --yes --global

Conclusion

We’re excited for the community to use these skills to improve the experience of building with LangChain and LangSmith. We plan to continue adding skills content as new capabilities are added to LangSmith. In parallel, we are also releasing a set of skills for interacting with LangChain's open source libraries (LangChain, LangGraph and DeepAgents). If you have ideas for additional skills or improvements, we'd love to hear from you!