Utility Skills

General-purpose helper skills for everyday tasks: document processing (PDF, XLSX), memory management, screen capture, passwords, weather, text-to-speech, video downloading, and more. These skills handle the miscellaneous but essential operations that do not fit neatly into other categories.

Overview

Utility skills cover a broad range of day-to-day operations. Some are built-in skills loaded directly by the harness, while others are implemented as MCP servers or external tool integrations. Five of these skills ship as active (enabled by default); the remaining seven are available but require configuration or external dependencies.

Skill	Status	Type	Description
`memory`	Active	Built-in	Persistent cross-session memory with auto-save and recall
`pdf`	Active	Built-in	Read, parse, extract text, and summarize PDF documents
`pdf-vision`	Active	Built-in	Vision-based PDF reading for scanned documents and images
`screen-capture`	Active	Built-in	Capture screenshots of the current screen or specific regions
`xlsx`	Active	Built-in	Read, write, and manipulate Excel spreadsheets
`1password`	Available	MCP	Look up credentials and secrets from 1Password vaults
`goplaces`	Available	MCP	Location search, directions, and place information
`summarize`	Available	Built-in	Summarize long text, articles, or transcripts
`tts`	Available	External	Text-to-speech conversion for audio output
`video-downloader`	Available	External	Download videos from URLs (YouTube, etc.)
`video-frames`	Available	External	Extract frames from video files for analysis
`weather`	Available	MCP	Current weather and forecasts by location

memory

Persistent memory that survives across sessions. The memory skill automatically saves important context -- decisions, preferences, project notes, feedback -- and recalls it when relevant. Memory is stored in Markdown files under ~/.cli-jaw/memory/ and indexed in SQLite for fast retrieval.

Property	Value
Skill ID	`memory`
Category	Utility
Status	Active (built-in)
Storage	`~/.cli-jaw/memory/` (Markdown + SQLite index)
Trigger	Remember, recall, note, preference, feedback

Capabilities

Action	Description
Auto-save	Automatically detects and persists important context from conversations
Recall	Retrieves relevant memories based on semantic similarity to the current conversation
Manual save	Explicitly save a note or decision via `/memory save`
Search	Full-text and semantic search across all saved memories
List	Browse all memory files organized by topic
Delete	Remove specific memory entries

Example Usage

# Manually save a note
/memory save "JWT tokens expire after 24h in this project"

# Search memories
/memory search "authentication"

# List all memory files
/memory list

# Delete a specific memory
/memory delete shared/old-decision.md

Natural language triggers:
"Remember this" -- saves the current context to memory
"What did I say last time?" -- recalls relevant past conversation context
"What were my preferred settings?" -- retrieves saved preferences
"Make a note of this feedback" -- saves feedback for future reference

Memory Structure

Memories are organized into namespaced Markdown files. The auto-memory system writes to MEMORY.md as an index, with detailed entries in topic-specific files.

~/.cli-jaw/memory/
  MEMORY.md                    # Index with links to detailed files
  feedback_goal_done_method.md # Specific feedback entry
  feedback_poll_dont_wait.md   # Another feedback entry
  shared/
    decisions.md               # Shared project decisions
    preferences.md             # User preferences
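
The Markdown-plus-SQLite layout can be illustrated with a small indexer. This is a minimal sketch, not the skill's actual implementation: the `memories` table, the `build_index` and `search` helpers, and the plain `LIKE` lookup are all assumptions made for illustration.

```python
import sqlite3
from pathlib import Path


def build_index(memory_dir: Path, db_path: Path) -> int:
    """Index every Markdown file under memory_dir; return how many were indexed."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per memory file, keyed by relative path.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS memories (path TEXT PRIMARY KEY, body TEXT)"
        )
        count = 0
        for md_file in sorted(memory_dir.rglob("*.md")):
            rel = md_file.relative_to(memory_dir).as_posix()
            conn.execute(
                "INSERT OR REPLACE INTO memories (path, body) VALUES (?, ?)",
                (rel, md_file.read_text(encoding="utf-8")),
            )
            count += 1
        conn.commit()
        return count
    finally:
        conn.close()


def search(db_path: Path, term: str) -> list[str]:
    """Return relative paths of files whose body contains term.

    SQLite's LIKE is case-insensitive for ASCII; % and _ in term act as wildcards.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT path FROM memories WHERE body LIKE ? ORDER BY path",
            (f"%{term}%",),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()
```

Keeping the Markdown files as the source of truth means the index can be rebuilt from scratch at any time by running `build_index` again.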

pdf

Text-based PDF processing. Reads PDF files, extracts text content, parses structure (headings, tables, lists), and supports summarization. Works best with digitally-created PDFs that contain selectable text.

Property	Value
Skill ID	`pdf`
Category	Utility
Status	Active (built-in)
Input	Local file path or URL
Best for	Digital PDFs with selectable text

Capabilities

Action	Description
Read	Extract full text content from a PDF file
Page range	Read specific pages (e.g., pages 1-5 of a large document)
Summarize	Generate a concise summary of the PDF content
Search	Find specific text or patterns within the document
Metadata	Extract title, author, page count, and other PDF metadata

Example Usage

# Read a PDF file
/pdf read ~/Documents/report.pdf

# Read specific pages
/pdf read ~/Documents/report.pdf --pages 1-5

# Summarize a PDF
/pdf summarize ~/Documents/whitepaper.pdf

# Extract metadata
/pdf info ~/Documents/contract.pdf

Natural language triggers:
"Summarize this PDF" -- summarizes the PDF document
"Find the contract terms in this document" -- searches for contract terms in the PDF
"How many pages is this PDF?" -- returns page count and metadata
"Read pages 3 through 7" -- reads the specified page range

pdf-vision

Vision-based PDF reading. Uses a multimodal model to interpret scanned documents, image-heavy PDFs, and pages where text extraction fails. Each page is rendered as an image and analyzed visually, making this skill essential for handwritten notes, scanned receipts, and complex layouts.

Property	Value
Skill ID	`pdf-vision`
Category	Utility
Status	Active (built-in)
Model	Vision-capable LLM (Claude, GPT-4V)
Best for	Scanned docs, handwritten text, image-heavy PDFs

When to Use pdf-vision vs pdf

Scenario	Recommended	Reason
Digital PDF with selectable text	`pdf`	Faster, more accurate text extraction
Scanned document (image-only)	`pdf-vision`	No selectable text to extract
PDF with charts and diagrams	`pdf-vision`	Can describe visual elements
Handwritten notes	`pdf-vision`	OCR via vision model
Mixed text + images	`pdf-vision`	Captures both textual and visual content
Large document (>50 pages)	`pdf`	Lower latency and token cost
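
The table above can be read as a small decision function. The sketch below is one possible encoding, not how the agent actually routes requests: `PdfProfile` and `choose_pdf_skill` are hypothetical names, and the order in which the rules are checked (text layer first, then size, then visual content) is an assumption, since the table does not say which row wins when several apply.

```python
from dataclasses import dataclass

LARGE_DOCUMENT_PAGES = 50  # threshold from the "Large document" row


@dataclass
class PdfProfile:
    has_selectable_text: bool
    page_count: int
    has_visual_content: bool = False  # charts, diagrams, embedded images
    is_handwritten: bool = False


def choose_pdf_skill(profile: PdfProfile) -> str:
    """Return "pdf" or "pdf-vision" for a document profile."""
    # Without a text layer (scans, handwriting) only the vision path can read the page.
    if not profile.has_selectable_text or profile.is_handwritten:
        return "pdf-vision"
    # Assumed precedence: for large documents with real text, cost and latency win.
    if profile.page_count > LARGE_DOCUMENT_PAGES:
        return "pdf"
    # Charts, diagrams, and mixed layouts benefit from visual analysis.
    if profile.has_visual_content:
        return "pdf-vision"
    return "pdf"
```

For a large document that also contains a few important diagrams, a reasonable compromise is to read it with `pdf` and pass only the diagram pages to `pdf-vision` using `--pages`.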

Example Usage

# Read a scanned PDF with vision
/pdf-vision read ~/Documents/scanned-receipt.pdf

# Analyze specific pages visually
/pdf-vision read ~/Documents/diagram-heavy.pdf --pages 3-5

# Extract text from a handwritten note
/pdf-vision read ~/Documents/handwritten.pdf

Natural language triggers:
"Read this scanned document" -- reads a scanned PDF using vision
"Organize the contents of this receipt" -- extracts and organizes receipt contents
"Explain this chart" -- describes charts and diagrams in the PDF
"Read this handwriting" -- performs OCR on handwritten content

screen-capture

Captures screenshots of the entire screen, specific windows, or defined regions. The captured images can be analyzed by vision models, saved to disk, or used as input for other skills like vision-click.

Property	Value
Skill ID	`screen-capture`
Category	Utility
Status	Active (built-in)
Output	PNG image (file or inline)
Platform	macOS (screencapture), Linux (scrot/gnome-screenshot)

Capabilities

Action	Description
Full screen	Capture the entire display
Window	Capture a specific application window
Region	Capture a rectangular region by coordinates
Analyze	Capture and immediately describe the contents using a vision model

Example Usage

# Capture full screen
/screen-capture

# Capture and analyze the screen
/screen-capture --analyze

# Save to a specific path
/screen-capture --output ~/Desktop/screenshot.png

# Capture a specific region (x, y, width, height)
/screen-capture --region 0,0,800,600
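
As a rough illustration of how a region capture could map onto the platform tools listed above, the sketch below builds the underlying command line without running it. It is an assumption about the mapping, not the skill's source: `capture_command` is a hypothetical helper, the macOS branch relies on `screencapture -R<x,y,w,h>`, and the Linux branch assumes a scrot version that supports `-a x,y,w,h`.

```python
from typing import List, Optional, Tuple

Region = Tuple[int, int, int, int]  # x, y, width, height


def capture_command(platform: str, output: str, region: Optional[Region] = None) -> List[str]:
    """Build (but do not run) a screenshot command for the given sys.platform value."""
    if platform == "darwin":
        cmd = ["screencapture", "-x"]  # -x suppresses the shutter sound
        if region is not None:
            x, y, w, h = region
            cmd.append(f"-R{x},{y},{w},{h}")  # capture only this rectangle
        return cmd + [output]
    if platform.startswith("linux"):
        cmd = ["scrot"]
        if region is not None:
            x, y, w, h = region
            cmd += ["-a", f"{x},{y},{w},{h}"]  # assumed: scrot with --autoselect support
        return cmd + [output]
    raise ValueError(f"unsupported platform: {platform}")
```

Returning an argument list rather than a shell string means the result can be passed straight to `subprocess.run` without quoting issues in the output path.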

Natural language triggers:
"Take a screenshot" -- captures the current screen
"Show me the current screen" -- captures and displays the current screen
"What's on this screen?" -- captures and analyzes the screen contents
"Capture the screen and save it" -- captures and saves to a file

xlsx

Excel spreadsheet processing. Reads, writes, and manipulates .xlsx files -- extracting data from sheets, creating new workbooks, updating cells, and converting between formats (CSV, JSON). Supports formulas, multiple sheets, and basic formatting.

Property	Value
Skill ID	`xlsx`
Category	Utility
Status	Active (built-in)
Formats	`.xlsx`, `.xls`, `.csv` (read/write)
Library	SheetJS (xlsx)

Capabilities

Action	Description
Read	Extract data from spreadsheet cells, rows, and sheets
Write	Create new workbooks or update existing ones
Convert	Convert between XLSX, CSV, and JSON formats
Analyze	Summarize data, compute statistics, describe sheet structure
Filter	Extract rows matching specific criteria

Example Usage

# Read an Excel file
/xlsx read ~/Documents/sales-data.xlsx

# Read a specific sheet
/xlsx read ~/Documents/report.xlsx --sheet "Q4 Summary"

# Convert XLSX to CSV
/xlsx convert ~/Documents/data.xlsx --to csv

# Create a new spreadsheet from data
/xlsx write ~/Documents/output.xlsx --data '[{"name":"Alice","score":95},{"name":"Bob","score":87}]'
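
The `--data` payload above is a JSON array of row objects, where keys become column headers. The sketch below shows that row-to-column mapping using only the Python standard library and CSV output. It is illustrative only: the skill itself is documented as using SheetJS, and `records_to_csv` is a hypothetical helper.

```python
import csv
import io
import json


def records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text with a header row."""
    records = json.loads(json_text)
    if not records:
        return ""
    # Column order follows the first appearance of each key across all records.
    fieldnames = list(dict.fromkeys(key for record in records for key in record))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return buffer.getvalue()
```

Passing the example payload from the `/xlsx write` command yields a `name,score` header followed by one line per person.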

Natural language triggers:
"Read this Excel file" -- reads and displays spreadsheet contents
"Turn this data into an Excel file" -- creates a new XLSX file from data
"Convert it to CSV" -- converts the spreadsheet to CSV format
"Summarize the sales data" -- analyzes and summarizes the spreadsheet data

1password

Secure credential lookup from 1Password vaults. The 1password skill uses the 1Password CLI (op) to search and retrieve items -- logins, passwords, secure notes, API keys -- without exposing them in plaintext to the conversation. Credentials are injected directly into commands or environment variables.

Property	Value
Skill ID	`1password`
Category	Utility
Status	Available (requires 1Password CLI)
Prerequisite	`op` CLI installed and signed in
Auth	Biometric or master password via `op`

Capabilities

Action	Description
Search	Find items by name, tag, or vault
Get	Retrieve a specific field (password, username, OTP)
Inject	Inject a secret into a command via `op run`
List vaults	Show available vaults and their item counts

Example Usage

# Search for a credential
/1password search "GitHub token"

# Get a specific password
/1password get "AWS Production" --field password

# Inject a secret into a command
op run --env-file=.env -- jaw serve

# List all vaults
/1password vaults

Natural language triggers:
"Find my password" -- searches for a matching credential in 1Password
"Get my GitHub token" -- retrieves the GitHub token from 1Password
"Put the AWS keys into environment variables" -- injects AWS credentials into the environment
"What was my password again?" -- searches vaults for the matching item

Security: The skill never logs or displays passwords in the conversation. Credentials are retrieved via the op CLI and passed through secure references (op://vault/item/field). Ensure your 1Password CLI session is authenticated before use.
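
The secure-reference flow described above can be sketched as follows. This is an illustration under stated assumptions, not the skill's code: the helper names and the `dev`/`database` vault and item are hypothetical. Only the `op://vault/item/field` form and the `op run --env-file=... -- command` invocation come from this page.

```python
from typing import Dict, List


def secret_reference(vault: str, item: str, field: str) -> str:
    """Build a 1Password secret reference of the form op://vault/item/field."""
    return f"op://{vault}/{item}/{field}"


def env_file_text(variables: Dict[str, str]) -> str:
    """Render NAME=op://... lines; the file holds references, never plaintext secrets."""
    return "".join(f"{name}={reference}\n" for name, reference in variables.items())


def op_run_args(env_file: str, command: List[str]) -> List[str]:
    """Argument list that lets `op run` resolve the references for one command."""
    return ["op", "run", f"--env-file={env_file}", "--", *command]


# Hypothetical vault and item names, for illustration only.
reference = secret_reference("dev", "database", "password")
env_text = env_file_text({"DB_PASSWORD": reference})
args = op_run_args(".env", ["jaw", "serve"])
```

Because the env file contains only references, it can be committed or shared without leaking the secret. The value is resolved at launch time inside the child process started by `op run`.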

goplaces

Location-based search and navigation. Finds places, provides directions, and returns information about businesses, landmarks, and addresses. Powered by mapping APIs through an MCP server.

Property	Value
Skill ID	`goplaces`
Category	Utility
Status	Available (MCP server)
API	Google Maps / Apple Maps
Prerequisite	Maps API key configured

Capabilities

Action	Description
Search places	Find restaurants, shops, landmarks by query
Directions	Get route and travel time between two locations
Place details	Hours, rating, phone number, address for a place
Nearby	Find places near a given location or current position

Example Usage

# Search for a place
/goplaces search "coffee shops near Gangnam Station"

# Get directions
/goplaces directions "Seoul Station" "Gangnam Station"

# Get details about a specific place
/goplaces details "Starbucks Gangnam"

# Find nearby restaurants
/goplaces nearby --type restaurant --radius 500m
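
To make the `--radius` option concrete, the sketch below filters candidate places by great-circle distance using the haversine formula. In practice the mapping API is likely to do this filtering server-side, so this only illustrates what a 500 m radius means. The function names are hypothetical and the coordinates in the usage note are approximate.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(
    origin: Tuple[float, float],
    places: List[Tuple[str, float, float]],
    radius_m: float,
) -> List[str]:
    """Names of the places that lie within radius_m of origin, in input order."""
    lat0, lon0 = origin
    return [name for name, lat, lon in places if haversine_m(lat0, lon0, lat, lon) <= radius_m]
```

With an origin near Gangnam Station (about 37.4979, 127.0276), a place 0.001 degrees of latitude away (roughly 110 m) passes a 500 m filter, while Seoul Station (about 37.5547, 126.9707, roughly 8 km away) does not.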

Natural language triggers:
"Find cafes nearby" -- searches for nearby coffee shops
"How do I get from Gangnam Station to Seoul Station?" -- provides directions between locations
"Tell me this shop's opening hours" -- retrieves business hours for a place
"Recommend good restaurants nearby" -- finds highly-rated restaurants nearby

summarize

Condenses long text into concise summaries. Works with articles, transcripts, meeting notes, documentation, and any lengthy text input. Supports multiple summary styles -- bullet points, paragraphs, executive summaries, and key takeaways.

Property	Value
Skill ID	`summarize`
Category	Utility
Status	Available
Input	Text, file path, or URL
Styles	Bullets, paragraph, executive, key-takeaways

Example Usage

# Summarize a file
/summarize ~/Documents/meeting-notes.md

# Summarize with a specific style
/summarize ~/Documents/report.pdf --style bullets

# Summarize clipboard content
/summarize --clipboard

# Summarize a web article
/summarize https://example.com/long-article

Natural language triggers:
"Summarize this" -- summarizes the provided content
"Tidy up the meeting notes" -- summarizes meeting notes into key points
"Give me just the key points of this article" -- extracts key takeaways from an article
"Summarize it in three lines" -- produces a concise three-line summary

tts

Text-to-speech conversion. Converts text input into spoken audio using TTS engines (system voices or cloud APIs). Useful for reading documents aloud, generating audio files from text, and accessibility.

Property	Value
Skill ID	`tts`
Category	Utility
Status	Available (external)
Engines	macOS `say`, OpenAI TTS, ElevenLabs
Output	Direct playback or audio file (MP3, WAV)

Example Usage

# Speak text aloud
/tts "Hello, this is a test of the text-to-speech system."

# Save to file
/tts "Meeting summary follows..." --output ~/Desktop/summary.mp3

# Use a specific voice
/tts "안녕하세요" --voice korean --engine openai

# Read a file aloud
/tts --file ~/Documents/notes.md

Natural language triggers:
"Read this aloud" -- reads the content aloud using TTS
"Let me hear it" -- converts text to speech and plays it
"Turn this memo into an audio file" -- generates an audio file from text
"Pronounce it in English" -- speaks the text in English pronunciation

video-downloader

Downloads videos from URLs using yt-dlp. Supports YouTube, Vimeo, Twitter, and more than 1,000 other sites. It can select a quality level, extract audio only, download subtitles, and save to a specified directory.

Property	Value
Skill ID	`video-downloader`
Category	Utility
Status	Available (external)
Prerequisite	`yt-dlp` installed
Platforms	YouTube, Vimeo, Twitter, and 1000+ sites

Capabilities

Action	Description
Download	Download video at best quality or specified resolution
Audio only	Extract and save only the audio track (MP3, M4A)
Subtitles	Download subtitles in specified language
Info	Show video metadata without downloading
Playlist	Download all videos in a playlist

Example Usage

# Download a video
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ

# Download audio only
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ --audio

# Download with specific quality
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ --quality 720p

# Download subtitles
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ --subs ko
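
Since the skill wraps yt-dlp, the flags above plausibly translate into yt-dlp options along the following lines. The mapping is an assumption, not taken from the skill's source, and `ytdlp_args` is a hypothetical helper. The yt-dlp options themselves (`-x`, `--audio-format`, `-f`, `--write-subs`, `--sub-langs`) are real.

```python
from typing import List, Optional


def ytdlp_args(
    url: str,
    audio_only: bool = False,
    max_height: Optional[int] = None,
    sub_lang: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a yt-dlp command line for one URL."""
    args = ["yt-dlp"]
    if audio_only:
        # Extract the audio track and convert it to MP3 (conversion needs ffmpeg).
        args += ["-x", "--audio-format", "mp3"]
    elif max_height is not None:
        # Best video at or below the requested height, merged with the best audio.
        args += ["-f", f"bestvideo[height<={max_height}]+bestaudio/best[height<={max_height}]"]
    if sub_lang:
        args += ["--write-subs", "--sub-langs", sub_lang]
    args.append(url)
    return args
```

For example, `--quality 720p` would correspond to `max_height=720`, and `--subs ko` to `sub_lang="ko"`.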

Natural language triggers:
"Download this video" -- downloads the video from the given URL
"Extract just the audio" -- extracts audio track from the video
"Download the subtitles" -- downloads subtitles for the video
"Save this YouTube video" -- downloads and saves the YouTube video

video-frames

Extracts individual frames from video files for analysis. Captures frames at specified intervals or timestamps, outputs them as images, and optionally sends them to a vision model for description. Useful for video summarization, content analysis, and thumbnail generation.

Property	Value
Skill ID	`video-frames`
Category	Utility
Status	Available (external)
Prerequisite	`ffmpeg` installed
Output	PNG/JPEG frame images

Example Usage

# Extract one frame per second
/video-frames ~/Videos/demo.mp4 --interval 1s

# Extract frame at a specific timestamp
/video-frames ~/Videos/demo.mp4 --at 01:23:45

# Extract and analyze frames with vision model
/video-frames ~/Videos/demo.mp4 --interval 10s --analyze

# Generate thumbnails for a video
/video-frames ~/Videos/demo.mp4 --count 5 --output ~/Desktop/thumbs/
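
The interval and timestamp modes can be expressed as ffmpeg invocations roughly like the ones below. This is a sketch of one plausible mapping, not the skill's implementation: the helper names and the `frame_%04d.png` naming pattern are assumptions, while the ffmpeg options (`-vf fps=...`, `-ss`, `-frames:v`) are standard.

```python
from typing import List


def frames_every(video: str, seconds: int, out_dir: str) -> List[str]:
    """ffmpeg command that writes one frame every `seconds` seconds as numbered PNGs."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # fps=1/N keeps one frame per N seconds of video.
    return ["ffmpeg", "-i", video, "-vf", f"fps=1/{seconds}", f"{out_dir}/frame_%04d.png"]


def frame_at(video: str, timestamp: str, out_file: str) -> List[str]:
    """ffmpeg command that grabs the single frame at `timestamp` (HH:MM:SS)."""
    # Placing -ss before -i seeks in the input, which is fast for long videos.
    return ["ffmpeg", "-ss", timestamp, "-i", video, "-frames:v", "1", out_file]
```

Under this mapping, `--interval 10s` corresponds to `frames_every(video, 10, out_dir)` and `--at 01:23:45` to `frame_at(video, "01:23:45", out_file)`.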

Natural language triggers:
"Extract frames from this video" -- extracts frames from the video
"Analyze the video content" -- extracts frames and describes the video content
"Make thumbnails" -- generates representative thumbnail images
"Capture the scene at 1 minute 30 seconds" -- captures the frame at the specified timestamp

weather

Current weather conditions and forecasts. Provides temperature, humidity, wind, precipitation, and multi-day forecasts for any location. Implemented as an MCP server that queries weather APIs.

Property	Value
Skill ID	`weather`
Category	Utility
Status	Available (MCP server)
API	OpenWeatherMap / WeatherAPI
Data	Current conditions, hourly, daily forecast

Capabilities

Action	Description
Current	Current temperature, humidity, wind, conditions
Forecast	Multi-day forecast with highs, lows, precipitation
Hourly	Hour-by-hour breakdown for the next 24-48 hours
Alerts	Severe weather alerts and warnings for the area

Example Usage

# Get current weather
/weather Seoul

# Get a 5-day forecast
/weather forecast "San Francisco" --days 5

# Get hourly forecast
/weather hourly Tokyo --hours 24

# Check weather alerts
/weather alerts "New York"

Natural language triggers:
"Tell me the weather" -- shows current weather for the default location
"How's the weather in Seoul?" -- shows current weather in Seoul
"Will it rain tomorrow?" -- checks tomorrow's precipitation forecast
"Give me this week's forecast" -- provides the weekly weather forecast
"Should I bring an umbrella?" -- checks rain probability and advises accordingly

Configuration

{
  "mcpServers": {
    "weather": {
      "command": "npx",
      "args": ["-y", "weather-mcp-server"],
      "env": {
        "WEATHER_API_KEY": "your-api-key",
        "WEATHER_DEFAULT_LOCATION": "Seoul"
      }
    }
  }
}
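
A common mistake with a configuration like the one above is leaving the placeholder API key in place. The sketch below is a small pre-flight check. It assumes the JSON shape shown above, and `weather_config_problems` is a hypothetical helper rather than part of the tool.

```python
import json
from typing import List

PLACEHOLDER_KEY = "your-api-key"  # the sample value from the configuration above


def weather_config_problems(config_text: str) -> List[str]:
    """Return human-readable problems found in an MCP config; empty list if none."""
    config = json.loads(config_text)
    server = config.get("mcpServers", {}).get("weather")
    if server is None:
        return ["no 'weather' entry under mcpServers"]
    problems = []
    if not server.get("command"):
        problems.append("missing command")
    api_key = server.get("env", {}).get("WEATHER_API_KEY", "")
    if not api_key or api_key == PLACEHOLDER_KEY:
        problems.append("WEATHER_API_KEY is unset or still the placeholder")
    return problems
```

Running the check on the sample configuration as written reports the placeholder key, and replacing the key with a real value clears the list.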

Skill Comparison

Quick reference for choosing the right utility skill for your task.

Document Processing

Task	Skill	Notes
Read a digital PDF	`pdf`	Fast text extraction, supports page ranges
Read a scanned PDF	`pdf-vision`	Vision-based OCR, slower but handles images
Read/write Excel	`xlsx`	Full XLSX/CSV/JSON support
Summarize any text	`summarize`	Works with files, URLs, and clipboard

Media and Capture

Task	Skill	Notes
Screenshot	`screen-capture`	Full screen, window, or region
Download video	`video-downloader`	Requires `yt-dlp`
Extract video frames	`video-frames`	Requires `ffmpeg`
Text to speech	`tts`	System or cloud voices

Information and Secrets

Task	Skill	Notes
Look up credentials	`1password`	Requires `op` CLI
Find places / directions	`goplaces`	Requires Maps API key
Check weather	`weather`	Requires Weather API key
Remember context	`memory`	Always active, auto-saves

Common Patterns

Document Pipeline

Combine the PDF, summarization, and spreadsheet skills to process documents end-to-end.

# Read a PDF, summarize it, and save the summary to a spreadsheet
"Read this PDF, summarize it, and organize the summary in Excel"

# The agent chains:
# 1. pdf → read the document
# 2. summarize → condense key points
# 3. xlsx → write summary to a spreadsheet

Research Workflow

Use weather, location, and memory skills together for planning tasks.

# Plan a trip
"Tell me the weather in Busan this weekend and find good restaurants nearby. Remember the results."

# The agent chains:
# 1. weather → forecast for Busan this weekend
# 2. goplaces → search restaurants near Busan
# 3. memory → save results for later reference

Secure Development

Pull credentials from 1Password and inject them into your dev environment.

# Set up dev environment with secure credentials
"Find the DB password in 1Password and put it in .env"

# The agent chains:
# 1. 1password → locate the "Database Production" item
# 2. write its op:// reference into the .env file, so the plaintext password is never stored

Note: When chaining skills, the agent determines the optimal execution order automatically. You do not need to specify individual skill invocations -- just describe what you want in natural language.
