Utility Skills

General-purpose helper skills for everyday tasks: document processing (PDF, XLSX), memory management, screen capture, passwords, weather, text-to-speech, video downloading, and more. These skills handle the miscellaneous but essential operations that do not fit neatly into other categories.

Overview

Utility skills cover a broad range of day-to-day operations. Some are built-in skills loaded directly by the harness, while others are implemented as MCP servers or external tool integrations. Five of these skills ship as active (enabled by default); the remaining seven are available but require configuration or external dependencies.

| Skill | Status | Type | Description |
| --- | --- | --- | --- |
| memory | Active | Built-in | Persistent cross-session memory with auto-save and recall |
| pdf | Active | Built-in | Read, parse, extract text, and summarize PDF documents |
| pdf-vision | Active | Built-in | Vision-based PDF reading for scanned documents and images |
| screen-capture | Active | Built-in | Capture screenshots of the current screen or specific regions |
| xlsx | Active | Built-in | Read, write, and manipulate Excel spreadsheets |
| 1password | Available | MCP | Look up credentials and secrets from 1Password vaults |
| goplaces | Available | MCP | Location search, directions, and place information |
| summarize | Available | Built-in | Summarize long text, articles, or transcripts |
| tts | Available | External | Text-to-speech conversion for audio output |
| video-downloader | Available | External | Download videos from URLs (YouTube, etc.) |
| video-frames | Available | External | Extract frames from video files for analysis |
| weather | Available | MCP | Current weather and forecasts by location |

memory

Persistent memory that survives across sessions. The memory skill automatically saves important context -- decisions, preferences, project notes, feedback -- and recalls them when relevant. Memory is stored in Markdown files under ~/.cli-jaw/memory/ and indexed in SQLite for fast retrieval.

| Property | Value |
| --- | --- |
| Skill ID | memory |
| Category | Utility |
| Status | Active (built-in) |
| Storage | ~/.cli-jaw/memory/ (Markdown + SQLite index) |
| Trigger | Remember, recall, note, preference, feedback |

Capabilities

| Action | Description |
| --- | --- |
| Auto-save | Automatically detects and persists important context from conversations |
| Recall | Retrieves relevant memories based on semantic similarity to the current conversation |
| Manual save | Explicitly save a note or decision via /memory save |
| Search | Full-text and semantic search across all saved memories |
| List | Browse all memory files organized by topic |
| Delete | Remove specific memory entries |

Example Usage

# Manually save a note
/memory save "JWT tokens expire after 24h in this project"

# Search memories
/memory search "authentication"

# List all memory files
/memory list

# Delete a specific memory
/memory delete shared/old-decision.md

Natural language triggers:
"이거 기억해줘" ("Remember this") -- saves the current context to memory
"지난번에 뭐라고 했었지?" ("What did we say last time?") -- recalls relevant past conversation context
"내가 선호하는 설정 뭐였어?" ("What were my preferred settings?") -- retrieves saved preferences
"피드백 메모해둬" ("Make a note of this feedback") -- saves feedback for future reference

Memory Structure

Memories are organized into namespaced Markdown files. The auto-memory system writes to MEMORY.md as an index, with detailed entries in topic-specific files.

~/.cli-jaw/memory/
  MEMORY.md                    # Index with links to detailed files
  feedback_goal_done_method.md # Specific feedback entry
  feedback_poll_dont_wait.md   # Another feedback entry
  shared/
    decisions.md               # Shared project decisions
    preferences.md             # User preferences

pdf

Text-based PDF processing. Reads PDF files, extracts text content, parses structure (headings, tables, lists), and supports summarization. Works best with digitally-created PDFs that contain selectable text.

| Property | Value |
| --- | --- |
| Skill ID | pdf |
| Category | Utility |
| Status | Active (built-in) |
| Input | Local file path or URL |
| Best for | Digital PDFs with selectable text |

Capabilities

| Action | Description |
| --- | --- |
| Read | Extract full text content from a PDF file |
| Page range | Read specific pages (e.g., pages 1-5 of a large document) |
| Summarize | Generate a concise summary of the PDF content |
| Search | Find specific text or patterns within the document |
| Metadata | Extract title, author, page count, and other PDF metadata |

Example Usage

# Read a PDF file
/pdf read ~/Documents/report.pdf

# Read specific pages
/pdf read ~/Documents/report.pdf --pages 1-5

# Summarize a PDF
/pdf summarize ~/Documents/whitepaper.pdf

# Extract metadata
/pdf info ~/Documents/contract.pdf

Natural language triggers:
"이 PDF 요약해줘" ("Summarize this PDF") -- summarizes the PDF document
"이 문서에서 계약 조건 찾아줘" ("Find the contract terms in this document") -- searches for contract terms in the PDF
"PDF 몇 페이지야?" ("How many pages is the PDF?") -- returns page count and metadata
"3페이지부터 7페이지까지 읽어줘" ("Read pages 3 through 7") -- reads the specified page range

pdf-vision

Vision-based PDF reading. Uses a multimodal model to interpret scanned documents, image-heavy PDFs, and pages where text extraction fails. Each page is rendered as an image and analyzed visually, making this skill essential for handwritten notes, scanned receipts, and complex layouts.

| Property | Value |
| --- | --- |
| Skill ID | pdf-vision |
| Category | Utility |
| Status | Active (built-in) |
| Model | Vision-capable LLM (Claude, GPT-4V) |
| Best for | Scanned docs, handwritten text, image-heavy PDFs |

When to Use pdf-vision vs pdf

| Scenario | Recommended | Reason |
| --- | --- | --- |
| Digital PDF with selectable text | pdf | Faster, more accurate text extraction |
| Scanned document (image-only) | pdf-vision | No selectable text to extract |
| PDF with charts and diagrams | pdf-vision | Can describe visual elements |
| Handwritten notes | pdf-vision | OCR via vision model |
| Mixed text + images | pdf-vision | Captures both textual and visual content |
| Large document (>50 pages) | pdf | Lower latency and token cost |
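
The table can be read as a small decision rule. The sketch below encodes it in Python under stated assumptions: `PdfTraits` and `choose_pdf_skill` are hypothetical names, and the precedence between rows (a document without selectable text always goes to pdf-vision, and the size rule is applied before the visual-content rule) is one reasonable reading of the table rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass
class PdfTraits:
    has_selectable_text: bool
    has_visual_content: bool  # charts, diagrams, handwriting, embedded images
    page_count: int


def choose_pdf_skill(traits: PdfTraits, large_threshold: int = 50) -> str:
    """Pick "pdf" or "pdf-vision" following the comparison table."""
    # Without selectable text, text extraction has nothing to work with.
    if not traits.has_selectable_text:
        return "pdf-vision"
    # Large documents favor the cheaper, faster text path (assumed precedence).
    if traits.page_count > large_threshold:
        return "pdf"
    # Charts, diagrams, or mixed content benefit from visual analysis.
    if traits.has_visual_content:
        return "pdf-vision"
    return "pdf"
```

For example, `choose_pdf_skill(PdfTraits(True, True, 12))` selects pdf-vision, while the same document at 120 pages falls back to pdf.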

Example Usage

# Read a scanned PDF with vision
/pdf-vision read ~/Documents/scanned-receipt.pdf

# Analyze specific pages visually
/pdf-vision read ~/Documents/diagram-heavy.pdf --pages 3-5

# Extract text from a handwritten note
/pdf-vision read ~/Documents/handwritten.pdf

Natural language triggers:
"이 스캔 문서 읽어줘" ("Read this scanned document") -- reads a scanned PDF using vision
"이 영수증 내용 정리해줘" ("Organize the contents of this receipt") -- extracts and organizes receipt contents
"이 차트 설명해줘" ("Explain this chart") -- describes charts and diagrams in the PDF
"손글씨 읽어줘" ("Read the handwriting") -- performs OCR on handwritten content

screen-capture

Captures screenshots of the entire screen, specific windows, or defined regions. The captured images can be analyzed by vision models, saved to disk, or used as input for other skills like vision-click.

| Property | Value |
| --- | --- |
| Skill ID | screen-capture |
| Category | Utility |
| Status | Active (built-in) |
| Output | PNG image (file or inline) |
| Platform | macOS (screencapture), Linux (scrot/gnome-screenshot) |

Capabilities

| Action | Description |
| --- | --- |
| Full screen | Capture the entire display |
| Window | Capture a specific application window |
| Region | Capture a rectangular region by coordinates |
| Analyze | Capture and immediately describe the contents using a vision model |
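
As a rough illustration of the Region action, the sketch below parses an "x,y,width,height" string and builds a macOS `screencapture` command for it. The `Region` class and both function names are hypothetical, and how the skill actually invokes the platform tool is an assumption; only the `-x` (no sound) and `-R` (capture rectangle) flags come from the macOS tool itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    x: int
    y: int
    width: int
    height: int


def parse_region(spec: str) -> Region:
    """Parse an "x,y,width,height" string such as "0,0,800,600"."""
    parts = spec.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected x,y,width,height but got {spec!r}")
    # int() raises ValueError for non-numeric parts.
    x, y, width, height = (int(p.strip()) for p in parts)
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return Region(x, y, width, height)


def macos_capture_command(region: Region, output: str) -> list[str]:
    """Build the argument list for a region capture on macOS."""
    rect = f"{region.x},{region.y},{region.width},{region.height}"
    # -x silences the shutter sound; -R limits the capture to the rectangle.
    return ["screencapture", "-x", f"-R{rect}", output]
```

The resulting list can be passed to `subprocess.run` on a macOS machine; Linux tools such as scrot use different flags and are not covered here.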

Example Usage

# Capture full screen
/screen-capture

# Capture and analyze the screen
/screen-capture --analyze

# Save to a specific path
/screen-capture --output ~/Desktop/screenshot.png

# Capture a specific region (x, y, width, height)
/screen-capture --region 0,0,800,600

Natural language triggers:
"스크린샷 찍어줘" ("Take a screenshot") -- captures the current screen
"지금 화면 보여줘" ("Show me the current screen") -- captures and displays the current screen
"이 화면에 뭐가 보여?" ("What do you see on this screen?") -- captures and analyzes the screen contents
"화면 캡처해서 저장해줘" ("Capture the screen and save it") -- captures and saves to a file

xlsx

Excel spreadsheet processing. Reads, writes, and manipulates .xlsx files -- extracting data from sheets, creating new workbooks, updating cells, and converting between formats (CSV, JSON). Supports formulas, multiple sheets, and basic formatting.

| Property | Value |
| --- | --- |
| Skill ID | xlsx |
| Category | Utility |
| Status | Active (built-in) |
| Formats | .xlsx, .xls, .csv (read/write) |
| Library | SheetJS (xlsx) |

Capabilities

| Action | Description |
| --- | --- |
| Read | Extract data from spreadsheet cells, rows, and sheets |
| Write | Create new workbooks or update existing ones |
| Convert | Convert between XLSX, CSV, and JSON formats |
| Analyze | Summarize data, compute statistics, describe sheet structure |
| Filter | Extract rows matching specific criteria |
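
To make the Convert action concrete, here is a small sketch of the JSON-to-CSV direction using only the Python standard library. It is not the skill's implementation (which the table above attributes to SheetJS); the function name `records_to_csv` and the column-ordering rule are assumptions chosen for illustration.

```python
import csv
import io
import json


def records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(json_text)
    if not records:
        return ""
    # Column order: keys in order of first appearance across all records.
    fieldnames: list[str] = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # Keys missing from a record are written as empty cells.
    writer.writerows(records)
    return buffer.getvalue()
```

Given a list of two records with `name` and `score` keys, the function returns a header row followed by one CSV row per record.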

Example Usage

# Read an Excel file
/xlsx read ~/Documents/sales-data.xlsx

# Read a specific sheet
/xlsx read ~/Documents/report.xlsx --sheet "Q4 Summary"

# Convert XLSX to CSV
/xlsx convert ~/Documents/data.xlsx --to csv

# Create a new spreadsheet from data
/xlsx write ~/Documents/output.xlsx --data '[{"name":"Alice","score":95},{"name":"Bob","score":87}]'

Natural language triggers:
"이 엑셀 파일 읽어줘" ("Read this Excel file") -- reads and displays spreadsheet contents
"이 데이터 엑셀로 만들어줘" ("Turn this data into an Excel file") -- creates a new XLSX file from data
"CSV로 변환해줘" ("Convert it to CSV") -- converts the spreadsheet to CSV format
"매출 데이터 요약해줘" ("Summarize the sales data") -- analyzes and summarizes the spreadsheet data

1password

Secure credential lookup from 1Password vaults. The 1password skill uses the 1Password CLI (op) to search and retrieve items -- logins, passwords, secure notes, API keys -- without exposing them in plaintext to the conversation. Credentials are injected directly into commands or environment variables.

| Property | Value |
| --- | --- |
| Skill ID | 1password |
| Category | Utility |
| Status | Available (requires 1Password CLI) |
| Prerequisite | op CLI installed and signed in |
| Auth | Biometric or master password via op |

Capabilities

| Action | Description |
| --- | --- |
| Search | Find items by name, tag, or vault |
| Get | Retrieve a specific field (password, username, OTP) |
| Inject | Inject a secret into a command via op run |
| List vaults | Show available vaults and their item counts |

Example Usage

# Search for a credential
/1password search "GitHub token"

# Get a specific password
/1password get "AWS Production" --field password

# Inject a secret into a command
op run --env-file=.env -- jaw serve

# List all vaults
/1password vaults

Natural language triggers:
"비밀번호 찾아줘" ("Find the password") -- searches for a matching credential in 1Password
"GitHub 토큰 가져와줘" ("Get the GitHub token") -- retrieves the GitHub token from 1Password
"AWS 키 환경변수에 넣어줘" ("Put the AWS key into an environment variable") -- injects AWS credentials into the environment
"내 비밀번호 뭐였지?" ("What was my password?") -- searches vaults for the matching item

Security: The skill never logs or displays passwords in the conversation. Credentials are retrieved via the op CLI and passed through secure references (op://vault/item/field). Ensure your 1Password CLI session is authenticated before use.
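
The secure-reference form mentioned above (op://vault/item/field) can be generated without ever touching the secret itself. The sketch below builds such references and the matching env-file lines that `op run --env-file` resolves at launch time. The function names and the vault name "Dev" in the usage note are hypothetical; the skill's own handling is not shown in this document.

```python
def secret_reference(vault: str, item: str, field: str) -> str:
    """Build an op://vault/item/field reference from its three parts."""
    for label, part in (("vault", vault), ("item", item), ("field", field)):
        # Each part must be non-empty and must not contain the separator.
        if not part or "/" in part:
            raise ValueError(f"invalid {label}: {part!r}")
    return f"op://{vault}/{item}/{field}"


def env_file_line(name: str, vault: str, item: str, field: str) -> str:
    """Format one env-file entry that points at a secret instead of storing it."""
    # Quoting keeps item names that contain spaces on a single value.
    return f'{name}="{secret_reference(vault, item, field)}"'
```

For instance, `env_file_line("DB_PASSWORD", "Dev", "Database Production", "password")` produces a line that is safe to commit, because it holds only the reference and the real value is substituted when the command runs under `op run`.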

goplaces

Location-based search and navigation. Finds places, provides directions, and returns information about businesses, landmarks, and addresses. Powered by mapping APIs through an MCP server.

| Property | Value |
| --- | --- |
| Skill ID | goplaces |
| Category | Utility |
| Status | Available (MCP server) |
| API | Google Maps / Apple Maps |
| Prerequisite | Maps API key configured |

Capabilities

| Action | Description |
| --- | --- |
| Search places | Find restaurants, shops, landmarks by query |
| Directions | Get route and travel time between two locations |
| Place details | Hours, rating, phone number, address for a place |
| Nearby | Find places near a given location or current position |

Example Usage

# Search for a place
/goplaces search "coffee shops near Gangnam Station"

# Get directions
/goplaces directions "Seoul Station" "Gangnam Station"

# Get details about a specific place
/goplaces details "Starbucks Gangnam"

# Find nearby restaurants
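/goplaces nearby --type restaurant --radius 500m

Natural language triggers:
"근처 카페 찾아줘" ("Find a cafe nearby") -- searches for nearby coffee shops
"강남역에서 서울역까지 어떻게 가?" ("How do I get from Gangnam Station to Seoul Station?") -- provides directions between locations
"이 가게 영업시간 알려줘" ("Tell me this shop's business hours") -- retrieves business hours for a place
"주변 맛집 추천해줘" ("Recommend good restaurants nearby") -- finds highly-rated restaurants nearby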

summarize

Condenses long text into concise summaries. Works with articles, transcripts, meeting notes, documentation, and any lengthy text input. Supports multiple summary styles -- bullet points, paragraphs, executive summaries, and key takeaways.

| Property | Value |
| --- | --- |
| Skill ID | summarize |
| Category | Utility |
| Status | Available |
| Input | Text, file path, or URL |
| Styles | Bullets, paragraph, executive, key-takeaways |

Example Usage

# Summarize a file
/summarize ~/Documents/meeting-notes.md

# Summarize with a specific style
/summarize ~/Documents/report.pdf --style bullets

# Summarize clipboard content
/summarize --clipboard

# Summarize a web article
/summarize https://example.com/long-article

Natural language triggers:
"이거 요약해줘" ("Summarize this") -- summarizes the provided content
"회의록 정리해줘" ("Tidy up the meeting minutes") -- summarizes meeting notes into key points
"이 기사 핵심만 알려줘" ("Give me just the key points of this article") -- extracts key takeaways from an article
"3줄로 요약해줘" ("Summarize it in three lines") -- produces a concise three-line summary

tts

Text-to-speech conversion. Converts text input into spoken audio using TTS engines (system voices or cloud APIs). Useful for reading documents aloud, generating audio files from text, and accessibility.

| Property | Value |
| --- | --- |
| Skill ID | tts |
| Category | Utility |
| Status | Available (external) |
| Engines | macOS say, OpenAI TTS, ElevenLabs |
| Output | Direct playback or audio file (MP3, WAV) |

Example Usage

# Speak text aloud
/tts "Hello, this is a test of the text-to-speech system."

# Save to file
/tts "Meeting summary follows..." --output ~/Desktop/summary.mp3

# Use a specific voice
/tts "안녕하세요" --voice korean --engine openai

# Read a file aloud
/tts --file ~/Documents/notes.md

Natural language triggers:
"이거 읽어줘" ("Read this aloud") -- reads the content aloud using TTS
"소리로 들려줘" ("Play it as audio") -- converts text to speech and plays it
"이 메모 음성 파일로 만들어줘" ("Turn this memo into an audio file") -- generates an audio file from text
"영어로 발음해줘" ("Pronounce it in English") -- pronounces the text in English

video-downloader

Downloads videos from URLs using yt-dlp. Supports YouTube, Vimeo, and hundreds of other video platforms. Can select quality, extract audio only, download subtitles, and save to specified directories.

| Property | Value |
| --- | --- |
| Skill ID | video-downloader |
| Category | Utility |
| Status | Available (external) |
| Prerequisite | yt-dlp installed |
| Platforms | YouTube, Vimeo, Twitter, and 1000+ sites |

Capabilities

| Action | Description |
| --- | --- |
| Download | Download video at best quality or specified resolution |
| Audio only | Extract and save only the audio track (MP3, M4A) |
| Subtitles | Download subtitles in specified language |
| Info | Show video metadata without downloading |
| Playlist | Download all videos in a playlist |
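
Since the skill wraps yt-dlp, several of these actions correspond to yt-dlp command-line options. The sketch below shows one plausible mapping; the function `build_ytdlp_args` and its parameters are hypothetical, and the way the skill actually translates its own `--audio`, `--quality`, and `--subs` flags is an assumption. The yt-dlp options used (`--extract-audio`, `--audio-format`, `--format`, `--write-subs`, `--sub-langs`) are part of yt-dlp itself.

```python
from typing import Optional


def build_ytdlp_args(
    url: str,
    audio_only: bool = False,
    quality: Optional[str] = None,
    subs: Optional[str] = None,
) -> list[str]:
    """Build a yt-dlp argument list for a download request."""
    args = ["yt-dlp"]
    if audio_only:
        # Keep only the audio track and convert it to MP3.
        args += ["--extract-audio", "--audio-format", "mp3"]
    elif quality:
        # "720p" -> 720; cap the video height at that value.
        height = int(quality.rstrip("p"))
        args += [
            "--format",
            f"bestvideo[height<={height}]+bestaudio/best[height<={height}]",
        ]
    if subs:
        # Download subtitles for the requested language code, e.g. "ko".
        args += ["--write-subs", "--sub-langs", subs]
    args.append(url)
    return args
```

Returning a list rather than a shell string avoids quoting problems when the result is passed to `subprocess.run`.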

Example Usage

# Download a video
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ

# Download audio only
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ --audio

# Download with specific quality
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ --quality 720p

# Download subtitles
/video-downloader https://www.youtube.com/watch?v=dQw4w9WgXcQ --subs ko

Natural language triggers:
"이 영상 다운받아줘" ("Download this video") -- downloads the video from the given URL
"음성만 추출해줘" ("Extract only the audio") -- extracts audio track from the video
"자막 다운받아줘" ("Download the subtitles") -- downloads subtitles for the video
"이 유튜브 영상 저장해줘" ("Save this YouTube video") -- downloads and saves the YouTube video

video-frames

Extracts individual frames from video files for analysis. Captures frames at specified intervals or timestamps, outputs them as images, and optionally sends them to a vision model for description. Useful for video summarization, content analysis, and thumbnail generation.

| Property | Value |
| --- | --- |
| Skill ID | video-frames |
| Category | Utility |
| Status | Available (external) |
| Prerequisite | ffmpeg installed |
| Output | PNG/JPEG frame images |
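
Frame extraction at intervals or at a timestamp, as described above, maps naturally onto ffmpeg. The sketch below builds the two corresponding ffmpeg commands; the function names and default output names are hypothetical, and the exact commands the skill runs are not documented here. The ffmpeg options used (`-i`, `-ss`, `-vf fps=...`, `-frames:v`) are standard.

```python
def frames_at_interval(
    video: str, seconds: int, out_pattern: str = "frame_%04d.png"
) -> list[str]:
    """Build an ffmpeg command that saves one frame every `seconds` seconds."""
    if seconds <= 0:
        raise ValueError("interval must be a positive number of seconds")
    # fps=1/N emits one frame per N seconds; %04d numbers the output files.
    return ["ffmpeg", "-i", video, "-vf", f"fps=1/{seconds}", out_pattern]


def frame_at(video: str, timestamp: str, out_file: str = "frame.png") -> list[str]:
    """Build an ffmpeg command that saves the single frame at HH:MM:SS."""
    # Placing -ss before -i seeks in the input before decoding starts.
    return ["ffmpeg", "-ss", timestamp, "-i", video, "-frames:v", "1", out_file]
```

Both functions only build argument lists. Paths containing `~` should be expanded (for example with `os.path.expanduser`) before they are passed in, because ffmpeg does not expand them itself.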

Example Usage

# Extract one frame per second
/video-frames ~/Videos/demo.mp4 --interval 1s

# Extract frame at a specific timestamp
/video-frames ~/Videos/demo.mp4 --at 01:23:45

# Extract and analyze frames with vision model
/video-frames ~/Videos/demo.mp4 --interval 10s --analyze

# Generate thumbnails for a video
/video-frames ~/Videos/demo.mp4 --count 5 --output ~/Desktop/thumbs/

Natural language triggers:
"이 영상에서 프레임 추출해줘" ("Extract frames from this video") -- extracts frames from the video
"영상 내용 분석해줘" ("Analyze the video content") -- extracts frames and describes the video content
"썸네일 만들어줘" ("Make thumbnails") -- generates representative thumbnail images
"1분 30초 장면 캡처해줘" ("Capture the scene at 1 minute 30 seconds") -- captures the frame at the specified timestamp

weather

Current weather conditions and forecasts. Provides temperature, humidity, wind, precipitation, and multi-day forecasts for any location. Implemented as an MCP server that queries weather APIs.

| Property | Value |
| --- | --- |
| Skill ID | weather |
| Category | Utility |
| Status | Available (MCP server) |
| API | OpenWeatherMap / WeatherAPI |
| Data | Current conditions, hourly, daily forecast |

Capabilities

| Action | Description |
| --- | --- |
| Current | Current temperature, humidity, wind, conditions |
| Forecast | Multi-day forecast with highs, lows, precipitation |
| Hourly | Hour-by-hour breakdown for the next 24-48 hours |
| Alerts | Severe weather alerts and warnings for the area |

Example Usage

# Get current weather
/weather Seoul

# Get a 5-day forecast
/weather forecast "San Francisco" --days 5

# Get hourly forecast
/weather hourly Tokyo --hours 24

# Check weather alerts
/weather alerts "New York"

Natural language triggers:
"날씨 알려줘" ("Tell me the weather") -- shows current weather for the default location
"서울 날씨 어때?" ("How's the weather in Seoul?") -- shows current weather in Seoul
"내일 비 와?" ("Will it rain tomorrow?") -- checks tomorrow's precipitation forecast
"이번 주 날씨 예보 알려줘" ("Tell me this week's weather forecast") -- provides the weekly weather forecast
"우산 가져가야 해?" ("Should I take an umbrella?") -- checks rain probability and advises accordingly

Configuration

{
  "mcpServers": {
    "weather": {
      "command": "npx",
      "args": ["-y", "weather-mcp-server"],
      "env": {
        "WEATHER_API_KEY": "your-api-key",
        "WEATHER_DEFAULT_LOCATION": "Seoul"
      }
    }
  }
}
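
A quick sanity check can catch a missing server entry or a placeholder API key before the server is started. The sketch below assumes the configuration shape shown above; the function name `missing_weather_settings` and the choice to treat only `WEATHER_API_KEY` as required are assumptions, since the document does not say which settings are mandatory.

```python
import json

# Assumed to be required; WEATHER_DEFAULT_LOCATION is treated as optional.
REQUIRED_ENV = ("WEATHER_API_KEY",)


def missing_weather_settings(config_text: str) -> list[str]:
    """Return the names of settings that are absent or still placeholders."""
    config = json.loads(config_text)
    server = config.get("mcpServers", {}).get("weather")
    if server is None:
        return ["mcpServers.weather"]
    problems = []
    if not server.get("command"):
        problems.append("command")
    env = server.get("env", {})
    for key in REQUIRED_ENV:
        value = env.get(key, "")
        # The literal "your-api-key" from the template counts as unset.
        if not value or value == "your-api-key":
            problems.append(f"env.{key}")
    return problems
```

An empty list means the entry looks complete. Run against the template above unchanged, the function reports `env.WEATHER_API_KEY` because the key is still the placeholder value.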

Skill Comparison

Quick reference for choosing the right utility skill for your task.

Document Processing

| Task | Skill | Notes |
| --- | --- | --- |
| Read a digital PDF | pdf | Fast text extraction, supports page ranges |
| Read a scanned PDF | pdf-vision | Vision-based OCR, slower but handles images |
| Read/write Excel | xlsx | Full XLSX/CSV/JSON support |
| Summarize any text | summarize | Works with files, URLs, and clipboard |

Media and Capture

| Task | Skill | Notes |
| --- | --- | --- |
| Screenshot | screen-capture | Full screen, window, or region |
| Download video | video-downloader | Requires yt-dlp |
| Extract video frames | video-frames | Requires ffmpeg |
| Text to speech | tts | System or cloud voices |

Information and Secrets

| Task | Skill | Notes |
| --- | --- | --- |
| Look up credentials | 1password | Requires op CLI |
| Find places / directions | goplaces | Requires Maps API key |
| Check weather | weather | Requires Weather API key |
| Remember context | memory | Always active, auto-saves |

Common Patterns

Document Pipeline

Combine the pdf, summarize, and xlsx skills to process documents end-to-end.

# Read a PDF, summarize it, and save the summary to a spreadsheet
"이 PDF 읽고 요약해서 엑셀로 정리해줘" ("Read this PDF, summarize it, and organize it in Excel")

# The agent chains:
# 1. pdf → read the document
# 2. summarize → condense key points
# 3. xlsx → write summary to a spreadsheet

Research Workflow

Use weather, location, and memory skills together for planning tasks.

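# Plan a trip
"이번 주말 부산 날씨 알려주고, 근처 맛집 찾아줘. 결과 기억해둬." ("Tell me the weather in Busan this weekend and find good restaurants nearby. Remember the results.")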

# The agent chains:
# 1. weather → forecast for Busan this weekend
# 2. goplaces → search restaurants near Busan
# 3. memory → save results for later reference

Secure Development

Pull credentials from 1Password and inject them into your dev environment.

# Set up dev environment with secure credentials
"1Password에서 DB 비밀번호 찾아서 .env에 넣어줘" ("Find the DB password in 1Password and put it in .env")

# The agent chains:
# 1. 1password → get "Database Production" password
# 2. inject into .env file via op:// reference

Note: When chaining skills, the agent determines the optimal execution order automatically. You do not need to specify individual skill invocations -- just describe what you want in natural language.