14 Commits

Author SHA1 Message Date
J. Nick Koston
5f155c90b7 Merge branch 'dev' into store-yaml-firmware 2026-06-09 20:51:44 -05:00
J. Nick Koston
a8d5ffc141 Merge branch 'dev' into store-yaml-firmware 2026-05-26 09:17:39 -05:00
J. Nick Koston
a9ec101631 [store_yaml] Use memcpy_P on ESP8266, std::memcpy elsewhere
Byte-by-byte `progmem_read_byte` for a 512-byte chunk does 512 aligned
32-bit flash reads + shifts on ESP8266. Switching to `memcpy_P` lets the
SDK do bulk aligned-flash copies. Other platforms get a plain
`std::memcpy` since their `PROGMEM` is a no-op and the blob lives in
normal address space.

`memcpy_P` is only available on ESP8266 (via `<pgmspace.h>`), so the
implementation is platform-gated; both paths boil down to a one-call
copy.
2026-05-15 22:36:15 -07:00
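
The access-count argument in this commit message can be illustrated with a small simulation. This is a rough sketch, not the SDK's implementation: it assumes flash can only be read as aligned little-endian 32-bit words, and the helper names (`to_words`, `read_byte_via_word`, `bulk_copy`) are hypothetical stand-ins for the flash array, `progmem_read_byte`, and `memcpy_P`.

```python
def to_words(data: bytes) -> list[int]:
    """Model flash as a list of aligned little-endian 32-bit words."""
    padded = data + b"\x00" * (-len(data) % 4)
    return [int.from_bytes(padded[i : i + 4], "little") for i in range(0, len(padded), 4)]


def read_byte_via_word(words: list[int], addr: int, stats: dict[str, int]) -> int:
    """Byte read emulated with one aligned word read plus a shift."""
    stats["reads"] += 1
    return (words[addr // 4] >> (8 * (addr % 4))) & 0xFF


def bulk_copy(words: list[int], pos: int, length: int, stats: dict[str, int]) -> bytes:
    """Bulk copy that reads each aligned word at most once."""
    out = bytearray()
    first, last = pos // 4, (pos + length - 1) // 4
    for index in range(first, last + 1):
        stats["reads"] += 1
        out += words[index].to_bytes(4, "little")
    start = pos % 4
    return bytes(out[start : start + length])


blob = bytes(range(256)) * 2  # one 512-byte chunk
words = to_words(blob)

slow = {"reads": 0}
byte_wise = bytes(read_byte_via_word(words, i, slow) for i in range(len(blob)))

fast = {"reads": 0}
bulk = bulk_copy(words, 0, len(blob), fast)
```

Under these assumptions the byte-wise path performs one word read per byte (512 for the chunk), while the bulk path performs one per word (128), which matches the "~4x" figure quoted in the code comment.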
J. Nick Koston
cd2c1014f3 [store_yaml] Address review: drop esp32-ard test, force on data field
- Remove `test.esp32-ard.yaml`; the esp32 platform is IDF-only for new
  component tests.
- `(force) = true` on `GetYamlResponse.data` so the field is always
  emitted on the wire (matches the camera streaming pattern). `done`
  stays unforced — it's `false` on every chunk except the last, and
  default-value elision saves ~2 bytes per chunk for the firmware to
  TX.
- `total_size` and `encoding` stay first-chunk-only; the client caches
  them. Firmware bandwidth is expensive, client RAM is not.
2026-05-15 10:33:44 -07:00
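
The "~2 bytes per chunk" saving follows from protobuf's wire format: a `bool` field costs one tag byte plus one value byte when emitted and nothing when elided. The sketch below is a hand-written approximation of the wire layout, not the generated encoder; `encode_chunk` and `varint` are hypothetical helpers, and the tag bytes are derived from the field numbers in `GetYamlResponse`.

```python
def varint(value: int) -> bytes:
    """Encode an unsigned integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_chunk(data: bytes, done: bool = False, total_size: int = 0, encoding: str = "") -> bytes:
    out = bytearray()
    # Field 1 (bytes, forced): tag 0x0A is always written, even for empty data.
    out += b"\x0a" + varint(len(data)) + data
    # Field 2 (bool): elided when false; tag 0x10 plus value 0x01 when true.
    if done:
        out += b"\x10\x01"
    # Field 3 (uint32): elided when zero, so only the first chunk pays for it.
    if total_size:
        out += b"\x18" + varint(total_size)
    # Field 4 (string): elided when empty.
    if encoding:
        raw = encoding.encode("utf-8")
        out += b"\x22" + varint(len(raw)) + raw
    return bytes(out)


payload = b"x" * 512
middle = encode_chunk(payload)
last = encode_chunk(payload, done=True)
first = encode_chunk(payload, total_size=2000, encoding="zstd")
```

Comparing `len(last)` with `len(middle)` shows the two-byte difference, and `encode_chunk(b"")` still yields the forced `data` tag and a zero length.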
J. Nick Koston
41081b7278 [store_yaml] Address review: ConfigType typing and drop redundant priority override
- Annotate `_final_validate` and `to_code` with `ConfigType` parameter and
  return types.
- Drop the `get_setup_priority()` override on `StoreYamlComponent`; the
  default `Component::get_setup_priority()` already returns
  `setup_priority::DATA`, so the override was a no-op.
2026-05-15 10:29:05 -07:00
J. Nick Koston
f06e96685b [store_yaml] Refactor YAML discovery to share bundle.py's approach
The previous implementation extended `track_yaml_loads` across the entire
validation pass to catch deferred `!include` and package loads, but that also
captured framework YAML loaded internally by component validators (e.g.
LVGL's `hello_world.yaml`) and produced spurious "unresolved substitution"
warnings during the pre-validation force-load. bundle.py already had the
right pattern: a fresh post-validation re-parse plus `force_load_include_files`.

- Lift the discovery into `yaml_util.discover_user_yaml_files` and have both
  `bundle.py` and `config.py` use it (DRY).
- Capture `secrets.yaml` / `secrets.yml` by the *un-resolved* listener fname
  so a `secrets.yaml` symlinked to a non-secrets-named target is still
  flagged for redaction.
- Add a `warn_on_unresolved` flag to `force_load_include_files` so the
  discovery path (where substitutions haven't run) logs at debug instead of
  warning.
- Reject `store_yaml` configs with an unencrypted API via
  `FINAL_VALIDATE_SCHEMA`; an explicit `allow_unencrypted: true` opt-out
  keeps lab setups working (renamed from `allow_unencrypted_api` so the
  integration-test harness's naive `api:` string replacement doesn't
  clobber it).
- Annotate `store_yaml_chunk_buf` with the
  `cppcoreguidelines-avoid-non-const-global-variables` suppression that
  matches other intentional API-side globals.
- Update unit tests to exercise the new `DiscoveredYamlFiles` shape plus a
  symlink-secrets case.
2026-05-15 10:21:00 -07:00
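
The symlink case in the second bullet comes down to which name is tested. A minimal sketch, assuming a local `SECRETS_FILES` tuple that mirrors `esphome.const.SECRETS_FILES` and a hypothetical `is_secret` helper (neither is the actual `yaml_util` code):

```python
from pathlib import PurePosixPath

# Local stand-in for esphome.const.SECRETS_FILES (assumed contents).
SECRETS_FILES = ("secrets.yaml", "secrets.yml")


def is_secret(listener_fname: str) -> bool:
    """Classify by the name the loader was asked for, before symlink resolution."""
    return PurePosixPath(listener_fname).name in SECRETS_FILES


# Hypothetical layout: config/secrets.yaml is a symlink to /vault/wifi-creds.yaml.
requested = "config/secrets.yaml"
resolved = "/vault/wifi-creds.yaml"
```

Testing `resolved` would miss the file because its basename is `wifi-creds.yaml`; testing `requested` flags it for redaction.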
J. Nick Koston
6493fdaba1 [store_yaml] Add end-to-end integration test for native-API recovery
Compiles a host build with `store_yaml`, drives a raw plaintext API socket
(the released aioesphomeapi does not yet know about GetYamlRequest /
GetYamlResponse and would silently drop the streamed bytes as "unknown
message type"), sends GetYamlRequest, accumulates the streamed
GetYamlResponse chunks until done=true, zstd-decompresses, and verifies
both the envelope structure and that the fixture's distinctive markers
(`store-yaml-test`, `store_yaml:`) round-trip back through the recovery
blob.
2026-05-15 10:20:45 -07:00
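
The test's docstring describes the plaintext framing as `0x00 | varint(size) | varint(msg_type) | payload`. A minimal sketch of that framing follows, assuming `size` counts payload bytes only; `encode_frame` and `decode_frame` are hypothetical names, not the helpers used by the test.

```python
def encode_varint(value: int) -> bytes:
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7


def encode_frame(msg_type: int, payload: bytes = b"") -> bytes:
    """Preamble byte, payload length, message type, then the payload."""
    return b"\x00" + encode_varint(len(payload)) + encode_varint(msg_type) + payload


def decode_frame(buf: bytes) -> tuple[int, bytes, int]:
    """Return (msg_type, payload, bytes_consumed) for one complete frame."""
    if buf[0] != 0x00:
        raise ValueError("missing plaintext preamble")
    size, pos = read_varint(buf, 1)
    msg_type, pos = read_varint(buf, pos)
    end = pos + size
    if end > len(buf):
        raise ValueError("incomplete frame")
    return msg_type, buf[pos:end], end


get_yaml_request = encode_frame(149)  # GetYamlRequest has no fields
```

Because `GetYamlRequest` is empty, the whole request is four bytes: the preamble, a zero length, and the two-byte varint for 149.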
J. Nick Koston
3d77e3f5dd [store_yaml] Address review: capability bit, dump suppression, paths, unit tests
- Advertise `has_store_yaml` in DeviceInfoResponse so recovery tooling can
  detect support without timing out against firmware built without
  USE_STORE_YAML.
- Suppress GetYamlResponse from the proto dump path. Every chunk would
  otherwise log embedded configuration (including opted-in secrets) on
  builds with HAS_PROTO_MESSAGE_DUMP, and bloat logs during recovery.
- Preserve the include graph for files outside the project root: use
  `os.path.relpath` instead of just the basename so e.g. two
  `../common.yaml` siblings don't collide on recovery.
- Keep `track_yaml_loads` open across `validate_config` so files loaded
  by remote packages and substitution-resolved includes are captured.
- Add focused unit tests for `_gather_files` (redaction, secrets.yml,
  opt-in, dedupe, external-path handling, missing sources) and
  `_pack_envelope` (round-trip, UTF-8 paths, overlong-path guard).
- Make the test_bundle assertion case-insensitive.
2026-05-15 10:20:45 -07:00
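
The basename collision mentioned in the third bullet is easy to reproduce. The sketch below uses `posixpath.relpath` for every path so the result is the same on any host; the component itself tries `Path.relative_to` first and only falls back to `os.path.relpath`. The directory names and the `recovery_name` helper are made up for illustration.

```python
import posixpath


def recovery_name(path: str, root: str) -> str:
    """Name a file relative to the project root, keeping '..' components."""
    return posixpath.relpath(path, root)


root = "/home/user/project"
a = recovery_name("/home/user/shared/common.yaml", root)
b = recovery_name("/home/user/other/common.yaml", root)
inside = recovery_name("/home/user/project/packages/base.yaml", root)
```

Both outside files share the basename `common.yaml`, but their relative names differ (`../shared/common.yaml` and `../other/common.yaml`), so they can coexist in the recovery blob.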
J. Nick Koston
7f5de80f81 [store_yaml] Drop dead re-resolve and duplicated SECRETS_FILES check
`track_yaml_loads` already returns resolved Path objects, so re-wrapping
each entry in `Path(...).resolve()` was a no-op. The OR'd
`Path(path).name in SECRETS_FILES` branch always evaluated the same as
the first check on the resolved name; removing it. Document the symlink
edge case the resolver swallows so a future change to the loader can
revisit it.
2026-05-15 10:20:45 -07:00
J. Nick Koston
14628ab3a5 [store_yaml] Add return type annotation on _import_zstd 2026-05-15 10:20:45 -07:00
J. Nick Koston
150a65de8c [store_yaml] Force-load deferred includes; cover secrets.yml
- Promote `_force_load_include_files` from bundle.py to yaml_util.py as
  `force_load_include_files`. bundle.py and the new caller now share it.
- Call it inside the `track_yaml_loads` block in config.py so deferred
  `!include` values, packages, and similar are loaded while the listener
  is active. Without this, `CORE.data["yaml_sources"]` only contained the
  entry YAML, and store_yaml shipped an incomplete recovery blob.
- Treat both `secrets.yaml` and `secrets.yml` as secrets (matches
  `esphome.const.SECRETS_FILES`); also check the original (non-resolved)
  filename to catch a `secrets.yaml` symlinked to an unrelated target.
- Update bundle tests for the new import path and message wording.
2026-05-15 10:20:45 -07:00
J. Nick Koston
783afaeaf2 [store_yaml] Streaming: advance pos only on successful send
Drop the pre-advance + rewind-on-failure dance in try_send_store_yaml_; match
the camera streaming pattern (advance after send_message succeeds, retry the
same chunk on WOULD_BLOCK). Also remove the unreachable `|| pos == 0` branch
in the while condition — on_get_yaml_request short-circuits the zero-size
path before this function is ever called.
2026-05-15 10:20:04 -07:00
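
The advance-after-success pattern can be modelled outside the firmware. This is a simplified sketch rather than the C++ handler: `pump` plays the role of one `try_send_store_yaml_` call, `None` stands in for the "not streaming" sentinel, and `flaky_send` is a hypothetical transport that refuses every third attempt.

```python
from typing import Callable, Optional

CHUNK_SIZE = 512


def pump(blob: bytes, pos: Optional[int], send: Callable[[bytes, bool], bool]) -> Optional[int]:
    """Send chunks until the blob is exhausted or the transport refuses one."""
    if pos is None:  # not streaming
        return None
    while pos < len(blob):
        chunk = blob[pos : pos + CHUNK_SIZE]
        done = pos + len(chunk) >= len(blob)
        if not send(chunk, done):
            return pos  # would block: retry the same chunk on the next loop
        pos += len(chunk)  # advance only after a successful send
    return None


received: list[tuple[bytes, bool]] = []
state = {"attempts": 0}


def flaky_send(chunk: bytes, done: bool) -> bool:
    state["attempts"] += 1
    if state["attempts"] % 3 == 0:
        return False
    received.append((chunk, done))
    return True


blob = bytes(range(256)) * 5  # 1280 bytes: 512 + 512 + 256
pos: Optional[int] = 0
loops = 0
while pos is not None:
    pos = pump(blob, pos, flaky_send)
    loops += 1
```

Since the offset never moves on a refused send, no rewind logic is needed and the receiver sees each chunk exactly once, with `done` set only on the last one.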
J. Nick Koston
59e8242756 [store_yaml] Address review: combined dump_config, encapsulate PROGMEM read
- Coalesce dump_config into a single ESP_LOGCONFIG call.
- Replace the raw get_data() accessor with a read_chunk() helper that copies
  bytes via progmem_read_byte, so callers cannot accidentally dereference the
  PROGMEM pointer directly on ESP8266.
- Update the API streaming handler to use read_chunk().
2026-05-15 10:20:04 -07:00
J. Nick Koston
d0510364c9 [store_yaml] Embed user YAML in firmware for recovery
Adds a new opt-in component that compresses the on-disk YAML files with zstd
at codegen time and stores them in PROGMEM. The native API exposes a chunked
GetYaml RPC so a lost configuration can be retrieved from a running device.
Decompression happens client-side; no decompressor is shipped on-device.
2026-05-15 10:20:04 -07:00
26 changed files with 918 additions and 3 deletions

@@ -505,6 +505,7 @@ esphome/components/st7735/* @SenexCrenshaw
esphome/components/st7789v/* @kbx81
esphome/components/st7920/* @marsjan155
esphome/components/statsd/* @Links2004
esphome/components/store_yaml/* @bdraco
esphome/components/stts22h/* @B48D81EFCC
esphome/components/substitutions/* @esphome/core
esphome/components/sun/* @OttoWinter

@@ -76,6 +76,8 @@ service APIConnection {
rpc serial_proxy_set_modem_pins(SerialProxySetModemPinsRequest) returns (void) {}
rpc serial_proxy_get_modem_pins(SerialProxyGetModemPinsRequest) returns (void) {}
rpc serial_proxy_request(SerialProxyRequest) returns (void) {}
rpc get_yaml(GetYamlRequest) returns (void) {}
}
@@ -296,6 +298,11 @@ message DeviceInfoResponse {
// Serial proxy instance metadata
repeated SerialProxyInfo serial_proxies = 25 [(field_ifdef) = "USE_SERIAL_PROXY", (fixed_array_size_define) = "SERIAL_PROXY_COUNT"];
// Whether this firmware embeds its YAML configuration for recovery via
// `get_yaml`. Clients use this to skip the request entirely when the
// device cannot answer it instead of waiting for a timeout.
bool has_store_yaml = 26 [(field_ifdef) = "USE_STORE_YAML"];
}
message ListEntitiesRequest {
@@ -2733,3 +2740,27 @@ message BluetoothSetConnectionParamsResponse {
uint64 address = 1;
int32 error = 2;
}
// ==================== STORE YAML ====================
// Embed the user's YAML in firmware and stream it back over the API so a lost
// config can be recovered from a running device. The device only stores the
// compressed bytes; decompression happens client-side.
message GetYamlRequest {
option (id) = 149;
option (source) = SOURCE_CLIENT;
option (ifdef) = "USE_STORE_YAML";
option (no_delay) = true;
}
message GetYamlResponse {
option (id) = 150;
option (source) = SOURCE_SERVER;
option (ifdef) = "USE_STORE_YAML";
bytes data = 1 [(force) = true];
bool done = 2;
// Sent on the first chunk only — the client is expected to cache it.
// (firmware bandwidth/flash is expensive; the client has gigabytes.)
uint32 total_size = 3;
string encoding = 4 [(max_data_length) = 8];
}
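
On the client side, the first-chunk-only fields imply a small amount of state. A minimal sketch, assuming responses have already been parsed into `(data, done, total_size, encoding)` tuples; `assemble` is a hypothetical helper and not part of any released client library.

```python
from typing import Iterable, Optional


def assemble(responses: Iterable[tuple[bytes, bool, int, str]]) -> tuple[bytes, str]:
    """Accumulate chunks until done=true; cache total_size/encoding from chunk one."""
    blob = bytearray()
    total_size: Optional[int] = None
    encoding = ""
    for data, done, chunk_total, chunk_encoding in responses:
        if total_size is None:
            # Later chunks carry default values for these fields, so cache them now.
            total_size, encoding = chunk_total, chunk_encoding
        blob += data
        if done:
            break
    else:
        raise ValueError("stream ended before done=true")
    if total_size and len(blob) != total_size:
        raise ValueError(f"expected {total_size} bytes, got {len(blob)}")
    return bytes(blob), encoding


chunks = [
    (b"a" * 512, False, 600, "zstd"),
    (b"b" * 88, True, 0, ""),
]
compressed, encoding = assemble(chunks)
```

The returned bytes are still compressed; the caller would pick a decompressor based on `encoding`. A device without a blob answers with a single empty `done=true` response, which this helper returns as `(b"", "")`.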

@@ -53,6 +53,9 @@
#ifdef USE_RADIO_FREQUENCY
#include "esphome/components/radio_frequency/radio_frequency.h"
#endif
#ifdef USE_STORE_YAML
#include "esphome/components/store_yaml/store_yaml.h"
#endif
namespace esphome::api {
@@ -311,6 +314,10 @@ void APIConnection::loop() {
// (missing a frame is fine, missing a state update is not)
this->try_send_camera_image_();
#endif
#ifdef USE_STORE_YAML
this->try_send_store_yaml_();
#endif
}
void APIConnection::check_keepalive_(uint32_t now) {
@@ -1165,6 +1172,73 @@ void APIConnection::on_camera_image_request(const CameraImageRequest &msg) {
}
#endif
#ifdef USE_STORE_YAML
// Chunk size per GetYamlResponse. Small enough to leave room for the protobuf frame
// inside the 65535-byte API limit and friendly to TCP MSS.
static constexpr size_t STORE_YAML_CHUNK_SIZE = 512;
// Scratch buffer used to copy a chunk from PROGMEM (needed on ESP8266; harmless
// elsewhere). Sharing it across connections is safe because the API loop is
// single-threaded and each chunk is filled and consumed atomically inside one
// `try_send_store_yaml_` iteration.
// NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)
static uint8_t store_yaml_chunk_buf[STORE_YAML_CHUNK_SIZE];
void APIConnection::on_get_yaml_request() {
auto *comp = store_yaml::global_store_yaml;
if (comp == nullptr || comp->get_size() == 0) {
// No blob — send a single done=true response so the client doesn't hang.
GetYamlResponse resp;
resp.done = true;
this->send_message(resp);
return;
}
this->store_yaml_pos_ = 0;
this->try_send_store_yaml_();
}
void APIConnection::try_send_store_yaml_() {
if (this->store_yaml_pos_ == std::numeric_limits<size_t>::max())
return;
auto *comp = store_yaml::global_store_yaml;
if (comp == nullptr) {
this->store_yaml_pos_ = std::numeric_limits<size_t>::max();
return;
}
const size_t total = comp->get_size();
// Camera-style streaming: advance the position only after a successful send,
// so a WOULD_BLOCK simply retries the same chunk on the next loop iteration.
while (this->store_yaml_pos_ < total) {
if (!this->helper_->can_write_without_blocking())
return;
const size_t remaining = total - this->store_yaml_pos_;
const size_t to_send = remaining < STORE_YAML_CHUNK_SIZE ? remaining : STORE_YAML_CHUNK_SIZE;
// Copy a chunk out of PROGMEM into the shared scratch buffer; on ESP8266 this
// routes through memcpy_P, on every other platform it's a plain std::memcpy.
comp->read_chunk(this->store_yaml_pos_, store_yaml_chunk_buf, to_send);
GetYamlResponse resp;
resp.set_data(store_yaml_chunk_buf, to_send);
if (this->store_yaml_pos_ == 0) {
resp.total_size = static_cast<uint32_t>(total);
resp.encoding = StringRef(store_yaml::ENCODING);
}
resp.done = (this->store_yaml_pos_ + to_send) >= total;
if (!this->send_message(resp))
return; // retry on next loop, pos unchanged
this->store_yaml_pos_ += to_send;
}
// Reached end successfully — final response (with done=true) already sent above.
this->store_yaml_pos_ = std::numeric_limits<size_t>::max();
}
#endif
#ifdef USE_HOMEASSISTANT_TIME
void APIConnection::on_get_time_response(const GetTimeResponse &value) {
if (homeassistant::global_homeassistant_time != nullptr) {
@@ -1808,6 +1882,9 @@ bool APIConnection::send_device_info_response_() {
#ifdef USE_DEEP_SLEEP
resp.has_deep_sleep = deep_sleep::global_has_deep_sleep;
#endif
#ifdef USE_STORE_YAML
resp.has_store_yaml = store_yaml::global_store_yaml != nullptr && store_yaml::global_store_yaml->get_size() > 0;
#endif
#ifdef ESPHOME_PROJECT_NAME
#ifdef USE_ESP8266
static const char PROJECT_NAME_PROGMEM[] PROGMEM = ESPHOME_PROJECT_NAME;
@@ -2055,10 +2132,15 @@ bool APIConnection::try_to_clear_buffer_slow_(bool log_out_of_space) {
bool APIConnection::send_message_(uint32_t payload_size, uint8_t message_type, MessageEncodeFn encode_fn,
const void *msg) {
#ifdef HAS_PROTO_MESSAGE_DUMP
-// Skip dump for log messages (recursive logging risk) and camera frames (high-frequency noise)
+// Skip dump for log messages (recursive logging risk), camera frames (high-frequency noise),
+// and YAML recovery payloads (every chunk would log the embedded config, including any
+// secrets the user opted into).
if (message_type != SubscribeLogsResponse::MESSAGE_TYPE
#ifdef USE_CAMERA
&& message_type != CameraImageResponse::MESSAGE_TYPE
#endif
#ifdef USE_STORE_YAML
&& message_type != GetYamlResponse::MESSAGE_TYPE
#endif
) {
auto *proto_msg = static_cast<const ProtoMessage *>(msg);

@@ -124,6 +124,9 @@ class APIConnection final : public APIServerConnectionBase {
void set_camera_state(std::shared_ptr<camera::CameraImage> image);
void on_camera_image_request(const CameraImageRequest &msg);
#endif
#ifdef USE_STORE_YAML
void on_get_yaml_request();
#endif
#ifdef USE_CLIMATE
bool send_climate_state(climate::Climate *climate);
void on_climate_command_request(const ClimateCommandRequest &msg);
@@ -398,6 +401,12 @@ class APIConnection final : public APIServerConnectionBase {
void try_send_camera_image_();
#endif
#ifdef USE_STORE_YAML
void try_send_store_yaml_();
// Streaming offset into the PROGMEM blob; max() means "not streaming".
size_t store_yaml_pos_{std::numeric_limits<size_t>::max()};
#endif
#ifdef USE_API_HOMEASSISTANT_STATES
void process_state_subscriptions_();
#endif

@@ -150,6 +150,9 @@ uint8_t *DeviceInfoResponse::encode(ProtoWriteBuffer &buffer PROTO_ENCODE_DEBUG_
for (const auto &it : this->serial_proxies) {
ProtoEncode::encode_sub_message(pos PROTO_ENCODE_DEBUG_ARG, buffer, 25, it);
}
#endif
#ifdef USE_STORE_YAML
ProtoEncode::encode_bool(pos PROTO_ENCODE_DEBUG_ARG, 26, this->has_store_yaml);
#endif
return pos;
}
@@ -212,6 +215,9 @@ uint32_t DeviceInfoResponse::calculate_size() const {
for (const auto &it : this->serial_proxies) {
size += ProtoSize::calc_message_force(2, it.calculate_size());
}
#endif
#ifdef USE_STORE_YAML
size += ProtoSize::calc_bool(2, this->has_store_yaml);
#endif
return size;
}
@@ -4155,5 +4161,25 @@ uint32_t BluetoothSetConnectionParamsResponse::calculate_size() const {
return size;
}
#endif
#ifdef USE_STORE_YAML
uint8_t *GetYamlResponse::encode(ProtoWriteBuffer &buffer PROTO_ENCODE_DEBUG_PARAM) const {
uint8_t *__restrict__ pos = buffer.get_pos();
ProtoEncode::write_raw_byte(pos PROTO_ENCODE_DEBUG_ARG, 10);
ProtoEncode::encode_varint_raw(pos PROTO_ENCODE_DEBUG_ARG, this->data_len_);
ProtoEncode::encode_raw(pos PROTO_ENCODE_DEBUG_ARG, this->data_ptr_, this->data_len_);
ProtoEncode::encode_bool(pos PROTO_ENCODE_DEBUG_ARG, 2, this->done);
ProtoEncode::encode_uint32(pos PROTO_ENCODE_DEBUG_ARG, 3, this->total_size);
ProtoEncode::encode_string(pos PROTO_ENCODE_DEBUG_ARG, 4, this->encoding);
return pos;
}
uint32_t GetYamlResponse::calculate_size() const {
uint32_t size = 0;
size += ProtoSize::calc_length_force(1, this->data_len_);
size += ProtoSize::calc_bool(1, this->done);
size += ProtoSize::calc_uint32(1, this->total_size);
size += !this->encoding.empty() ? 2 + this->encoding.size() : 0;
return size;
}
#endif
} // namespace esphome::api

@@ -525,7 +525,7 @@ class SerialProxyInfo final : public ProtoMessage {
class DeviceInfoResponse final : public ProtoMessage {
public:
static constexpr uint8_t MESSAGE_TYPE = 10;
-static constexpr uint16_t ESTIMATED_SIZE = 309;
+static constexpr uint16_t ESTIMATED_SIZE = 312;
#ifdef HAS_PROTO_MESSAGE_DUMP
const LogString *message_name() const override { return LOG_STR("device_info_response"); }
#endif
@@ -580,6 +580,9 @@ class DeviceInfoResponse final : public ProtoMessage {
#endif
#ifdef USE_SERIAL_PROXY
std::array<SerialProxyInfo, SERIAL_PROXY_COUNT> serial_proxies{};
#endif
#ifdef USE_STORE_YAML
bool has_store_yaml{false};
#endif
uint8_t *encode(ProtoWriteBuffer &buffer PROTO_ENCODE_DEBUG_PARAM) const;
uint32_t calculate_size() const;
@@ -3317,5 +3320,31 @@ class BluetoothSetConnectionParamsResponse final : public ProtoMessage {
protected:
};
#endif
#ifdef USE_STORE_YAML
class GetYamlResponse final : public ProtoMessage {
public:
static constexpr uint8_t MESSAGE_TYPE = 150;
static constexpr uint8_t ESTIMATED_SIZE = 34;
#ifdef HAS_PROTO_MESSAGE_DUMP
const LogString *message_name() const override { return LOG_STR("get_yaml_response"); }
#endif
const uint8_t *data_ptr_{nullptr};
size_t data_len_{0};
void set_data(const uint8_t *data, size_t len) {
this->data_ptr_ = data;
this->data_len_ = len;
}
bool done{false};
uint32_t total_size{0};
StringRef encoding{};
uint8_t *encode(ProtoWriteBuffer &buffer PROTO_ENCODE_DEBUG_PARAM) const;
uint32_t calculate_size() const;
#ifdef HAS_PROTO_MESSAGE_DUMP
const char *dump_to(DumpBuffer &out) const override;
#endif
protected:
};
#endif
} // namespace esphome::api

@@ -971,6 +971,9 @@ const char *DeviceInfoResponse::dump_to(DumpBuffer &out) const {
it.dump_to(out);
out.append("\n");
}
#endif
#ifdef USE_STORE_YAML
dump_field(out, ESPHOME_PSTR("has_store_yaml"), this->has_store_yaml);
#endif
return out.c_str();
}
@@ -2718,6 +2721,16 @@ const char *BluetoothSetConnectionParamsResponse::dump_to(DumpBuffer &out) const
return out.c_str();
}
#endif
#ifdef USE_STORE_YAML
const char *GetYamlResponse::dump_to(DumpBuffer &out) const {
MessageDumpHelper helper(out, ESPHOME_PSTR("GetYamlResponse"));
dump_bytes_field(out, ESPHOME_PSTR("data"), this->data_ptr_, this->data_len_);
dump_field(out, ESPHOME_PSTR("done"), this->done);
dump_field(out, ESPHOME_PSTR("total_size"), this->total_size);
dump_field(out, ESPHOME_PSTR("encoding"), this->encoding);
return out.c_str();
}
#endif
} // namespace esphome::api

@@ -702,6 +702,15 @@ void APIConnection::read_message_(uint32_t msg_size, uint32_t msg_type, const ui
this->on_bluetooth_set_connection_params_request(msg);
break;
}
#endif
#ifdef USE_STORE_YAML
case 149 /* GetYamlRequest is empty */: {
#ifdef HAS_PROTO_MESSAGE_DUMP
this->log_receive_message_(LOG_STR("on_get_yaml_request"));
#endif
this->on_get_yaml_request();
break;
}
#endif
default:
break;

@@ -236,6 +236,10 @@ class APIServerConnectionBase {
#ifdef USE_BLUETOOTH_PROXY
void on_bluetooth_set_connection_params_request(const BluetoothSetConnectionParamsRequest &value){};
#endif
#ifdef USE_STORE_YAML
void on_get_yaml_request(){};
#endif
};
} // namespace esphome::api

@@ -0,0 +1,171 @@
from __future__ import annotations
import logging
import os
from pathlib import Path
import struct
from types import ModuleType
import esphome.codegen as cg
import esphome.config_validation as cv
from esphome.const import CONF_API, CONF_ID, CONF_RAW_DATA_ID
from esphome.core import CORE, EsphomeError, HexInt
import esphome.final_validate as fv
from esphome.types import ConfigType
_LOGGER = logging.getLogger(__name__)
CODEOWNERS = ["@bdraco"]
DEPENDENCIES = ["api"]
CONF_INCLUDE_SECRETS = "include_secrets"
# Avoid an `_api:` substring in the key name so the integration-test harness
# (which naively str-replaces `api:` to inject a port directive) doesn't
# clobber configs that opt into this escape hatch.
CONF_ALLOW_UNENCRYPTED = "allow_unencrypted"
store_yaml_ns = cg.esphome_ns.namespace("store_yaml")
StoreYamlComponent = store_yaml_ns.class_("StoreYamlComponent", cg.Component)
# Compression level for zstd; 22 is the max and gives ~70-90% reduction on YAML.
ZSTD_LEVEL = 22
# Envelope magic: "EHY1" = ESPHome YAML, version 1.
ENVELOPE_MAGIC = b"EHY1"
# Replacement content when secrets are not included.
REDACTED_PLACEHOLDER = b"# redacted\n"
CONFIG_SCHEMA = cv.Schema(
{
cv.GenerateID(): cv.declare_id(StoreYamlComponent),
cv.GenerateID(CONF_RAW_DATA_ID): cv.declare_id(cg.uint8),
cv.Optional(CONF_INCLUDE_SECRETS, default=False): cv.boolean,
cv.Optional(CONF_ALLOW_UNENCRYPTED, default=False): cv.boolean,
}
).extend(cv.COMPONENT_SCHEMA)
def _final_validate(config: ConfigType) -> ConfigType:
    """Require API encryption: an unauthenticated client could otherwise pull
    the embedded YAML (which may include Wi-Fi credentials or opted-in
    secrets). The escape hatch ``allow_unencrypted: true`` exists for
    isolated lab setups where the user has accepted the trade-off."""
    full = fv.full_config.get()
    api_conf = full.get(CONF_API, {})
    if api_conf.get("encryption"):
        return config
    if config.get(CONF_ALLOW_UNENCRYPTED):
        _LOGGER.warning(
            "store_yaml is enabled without API encryption; any client that can "
            "reach the device on the network can pull the embedded YAML."
        )
        return config
    raise cv.Invalid(
        "store_yaml requires API encryption (configure `api.encryption.key`). "
        "Without encryption, the embedded YAML — which may contain Wi-Fi "
        "credentials or opted-in secrets — can be read by any client that "
        "reaches the device. Set `store_yaml.allow_unencrypted: true` to "
        "override after acknowledging the risk."
    )
FINAL_VALIDATE_SCHEMA = _final_validate
def _import_zstd() -> ModuleType:
    try:
        from compression import zstd  # noqa: PLC0415 — Python 3.14+ stdlib
    except ImportError:
        try:
            from backports import zstd  # noqa: PLC0415
        except ImportError as err:
            raise EsphomeError(
                "store_yaml requires zstd compression. Install backports.zstd for "
                "Python < 3.14 or upgrade to Python 3.14+."
            ) from err
    return zstd
def _gather_files(include_secrets: bool) -> list[tuple[str, bytes]]:
    """Read each YAML file the config loader touched, return (relative_path, content) pairs."""
    discovered = CORE.data.get("yaml_sources")
    if not discovered or not discovered.files:
        raise EsphomeError(
            "store_yaml could not find any tracked YAML files; the config loader "
            "did not populate CORE.data['yaml_sources']."
        )
    config_path = Path(CORE.config_path).resolve()
    root = config_path.parent
    secret_paths = discovered.secrets
    files: list[tuple[str, bytes]] = []
    for path in discovered.files:
        # `secret_paths` was collected from the *un-resolved* basename, so a
        # `secrets.yaml` symlinked to a differently-named target is still
        # treated as secrets here.
        if path in secret_paths and not include_secrets:
            content = REDACTED_PLACEHOLDER
        else:
            try:
                content = path.read_bytes()
            except OSError as err:
                _LOGGER.warning("store_yaml: skipping unreadable %s (%s)", path, err)
                continue
        try:
            rel_str = path.relative_to(root).as_posix()
        except ValueError:
            # Outside the project root (e.g. ../common.yaml or a secrets file in
            # $HOME). Use a relative path with ".." components instead of just
            # the basename so the include graph is preserved and files from
            # different directories with the same basename don't collide.
            rel_str = os.path.relpath(path, root).replace(os.sep, "/")
        files.append((rel_str, content))
    return files
def _pack_envelope(files: list[tuple[str, bytes]]) -> bytes:
    """Pack files into the EHY1 envelope.
    Layout: magic (4) | u32 file_count | repeat { u16 path_len | path_utf8 | u32 content_len | content_bytes }
    All integers are little-endian.
    """
    parts: list[bytes] = [ENVELOPE_MAGIC, struct.pack("<I", len(files))]
    for path, content in files:
        path_bytes = path.encode("utf-8")
        if len(path_bytes) > 0xFFFF:
            raise EsphomeError(
                f"store_yaml: path too long ({len(path_bytes)} bytes): {path}"
            )
        parts.append(struct.pack("<H", len(path_bytes)))
        parts.append(path_bytes)
        parts.append(struct.pack("<I", len(content)))
        parts.append(content)
    return b"".join(parts)
async def to_code(config: ConfigType) -> None:
    cg.add_define("USE_STORE_YAML")
    zstd = _import_zstd()
    files = _gather_files(config[CONF_INCLUDE_SECRETS])
    envelope = _pack_envelope(files)
    compressed = zstd.compress(envelope, level=ZSTD_LEVEL)
    _LOGGER.info(
        "store_yaml: embedding %d file(s) as %d bytes (%d uncompressed, %.1f%% ratio)",
        len(files),
        len(compressed),
        len(envelope),
        100.0 * len(compressed) / max(1, len(envelope)),
    )
    rhs = [HexInt(b) for b in compressed]
    prog_arr = cg.progmem_array(config[CONF_RAW_DATA_ID], rhs)
    var = cg.new_Pvariable(config[CONF_ID])
    await cg.register_component(var, config)
    cg.add(var.set_data(prog_arr, len(compressed), len(envelope)))

@@ -0,0 +1,43 @@
#include "store_yaml.h"
#ifdef USE_STORE_YAML
#include "esphome/core/log.h"
#include <cstring>
#ifdef USE_ESP8266
#include <pgmspace.h>
#endif
namespace esphome::store_yaml {
static const char *const TAG = "store_yaml";
// NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)
StoreYamlComponent *global_store_yaml = nullptr;
void StoreYamlComponent::setup() { global_store_yaml = this; }
void StoreYamlComponent::dump_config() {
ESP_LOGCONFIG(TAG,
"YAML:\n"
" Compressed size: %zu bytes\n"
" Uncompressed size: %zu bytes\n"
" Encoding: %s",
this->size_, this->uncompressed_size_, ENCODING);
}
void StoreYamlComponent::read_chunk(size_t pos, uint8_t *dst, size_t len) const {
#ifdef USE_ESP8266
// ESP8266 needs `memcpy_P` for aligned bulk flash reads; the byte-by-byte
// `progmem_read_byte` loop would otherwise emit ~4x as many flash accesses.
memcpy_P(dst, this->data_ + pos, len);
#else
// PROGMEM is a no-op everywhere else and the data lives in normal address
// space, so a plain `std::memcpy` is correct and the fast path.
std::memcpy(dst, this->data_ + pos, len);
#endif
}
} // namespace esphome::store_yaml
#endif // USE_STORE_YAML

@@ -0,0 +1,48 @@
#pragma once
#include "esphome/core/defines.h"
#ifdef USE_STORE_YAML
#include "esphome/core/component.h"
#include "esphome/core/hal.h"
namespace esphome::store_yaml {
// "zstd" — published in GetYamlResponse.encoding so clients know how to decompress.
constexpr const char *ENCODING = "zstd";
class StoreYamlComponent : public Component {
public:
void setup() override;
void dump_config() override;
// Called once from codegen with the PROGMEM blob.
void set_data(const uint8_t *data, size_t size, size_t uncompressed_size) {
this->data_ = data;
this->size_ = size;
this->uncompressed_size_ = uncompressed_size;
}
size_t get_size() const { return this->size_; }
size_t get_uncompressed_size() const { return this->uncompressed_size_; }
// Copy `len` bytes from the PROGMEM blob at offset `pos` into `dst`.
// Hides the platform-specific read: `memcpy_P` on ESP8266, where the blob
// lives in code space, and a plain `std::memcpy` everywhere else.
void read_chunk(size_t pos, uint8_t *dst, size_t len) const;
protected:
// Points to a `const uint8_t[] PROGMEM` array emitted by codegen. On ESP8266 this
// address is in instruction flash and must not be dereferenced directly, which is
// why `read_chunk` (backed by `memcpy_P`) exists. There is no public getter for
// the raw pointer; callers must go through `read_chunk`.
const uint8_t *data_{nullptr};
size_t size_{0};
size_t uncompressed_size_{0};
};
// NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)
extern StoreYamlComponent *global_store_yaml;
} // namespace esphome::store_yaml
#endif // USE_STORE_YAML

@@ -1211,13 +1211,24 @@ def _load_config(
raise InvalidYAMLError(e) from e
try:
-return validate_config(config, command_line_substitutions, skip_external_update)
+result = validate_config(
+config, command_line_substitutions, skip_external_update
+)
except EsphomeError:
raise
except Exception:
_LOGGER.error("Unexpected exception while reading configuration:")
raise
# Discover the user's on-disk YAML files via a fresh re-parse — same
# pattern bundle.py uses. Doing it post-validation (rather than keeping a
# listener installed across validation) avoids capturing framework YAML
# that components load internally (e.g. LVGL's `hello_world.yaml`). The
# result is consumed by components like store_yaml that want to embed the
# user's configuration in firmware for recovery.
CORE.data["yaml_sources"] = yaml_util.discover_user_yaml_files(CORE.config_path)
return result
def load_config(
command_line_substitutions: dict[str, Any], skip_external_update: bool = False

@@ -198,6 +198,7 @@
#define USE_API_CUSTOM_SERVICES
#define USE_API_USER_DEFINED_ACTION_RESPONSES
#define USE_API_USER_DEFINED_ACTION_RESPONSES_JSON
#define USE_STORE_YAML
#define API_MAX_SEND_QUEUE 8
#define MAX_API_CONNECTIONS 6
#define USE_MD5

@@ -27,6 +27,9 @@ smpclient==6.0.0
requests==2.34.2
py7zr==1.1.0
# zstd compression for store_yaml component (stdlib in 3.14+)
backports.zstd==1.5.0; python_version < "3.14"
# esp-idf >= 5.0 requires this
pyparsing >= 3.3.2

@@ -0,0 +1,4 @@
api:
store_yaml:
  allow_unencrypted: true

@@ -0,0 +1,5 @@
wifi:
  ssid: MySSID
  password: password1
<<: !include common.yaml

@@ -0,0 +1,5 @@
wifi:
  ssid: MySSID
  password: password1
<<: !include common.yaml

@@ -0,0 +1,5 @@
wifi:
  ssid: MySSID
  password: password1
<<: !include common.yaml

@@ -0,0 +1,3 @@
<<: !include common.yaml
network:

@@ -0,0 +1,5 @@
wifi:
  ssid: MySSID
  password: password1
<<: !include common.yaml

@@ -0,0 +1,5 @@
wifi:
  ssid: MySSID
  password: password1
<<: !include common.yaml

@@ -0,0 +1,5 @@
wifi:
  ssid: MySSID
  password: password1
<<: !include common.yaml

@@ -0,0 +1,14 @@
esphome:
  name: store-yaml-test
  areas:
    - id: living_room
      name: "Living Room"
host:
logger:
api:
store_yaml:
  allow_unencrypted: true

@@ -0,0 +1,227 @@
"""End-to-end test for the `store_yaml` recovery flow over the native API.
Talks plaintext API to a host build directly via asyncio sockets rather than
through aioesphomeapi: the released aioesphomeapi shipped with this PR does
not yet know about `GetYamlRequest` / `GetYamlResponse`, so the high-level
client would silently drop the streamed bytes as "unknown message type".
The raw client implements just enough of the plaintext framing
(``0x00 | varint(size) | varint(msg_type) | payload``, see
``api_frame_helper_plaintext.cpp``) to send the empty `GetYamlRequest`
(message type 149) and accumulate every `GetYamlResponse` (message type 150)
until ``done=true``.
"""
from __future__ import annotations
import asyncio
import contextlib
import struct
import pytest
try:
    from compression import zstd  # type: ignore[import-not-found]
except ImportError:
    from backports import zstd  # type: ignore[import-not-found, no-redef]
from .types import RunCompiledFunction
# Message IDs from esphome/components/api/api.proto.
HELLO_REQUEST = 1
HELLO_RESPONSE = 2
GET_YAML_REQUEST = 149
GET_YAML_RESPONSE = 150
ENVELOPE_MAGIC = b"EHY1"
def _encode_varint(value: int) -> bytes:
    """Encode an unsigned integer as a protobuf varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def _read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint from `buf` at `pos`; return ``(value, next_pos)``."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7


def _parse_get_yaml_response(payload: bytes) -> tuple[bytes, bool, int, str]:
    """Hand-rolled parser for `GetYamlResponse`.

    Returns ``(data, done, total_size, encoding)``.
    """
    data = b""
    done = False
    total_size = 0
    encoding = ""
    pos = 0
    while pos < len(payload):
        tag, pos = _read_varint(payload, pos)
        field_number = tag >> 3
        wire_type = tag & 0x07
        if wire_type == 0:  # varint
            value, pos = _read_varint(payload, pos)
            if field_number == 2:
                done = bool(value)
            elif field_number == 3:
                total_size = value
        elif wire_type == 2:  # length-delimited
            length, pos = _read_varint(payload, pos)
            chunk = payload[pos : pos + length]
            pos += length
            if field_number == 1:
                data = chunk
            elif field_number == 4:
                encoding = chunk.decode("utf-8")
        else:
            raise AssertionError(f"unexpected wire type {wire_type}")
    return data, done, total_size, encoding


def _unpack_envelope(blob: bytes) -> dict[str, bytes]:
    """Inverse of `_pack_envelope` in `esphome/components/store_yaml/__init__.py`."""
    assert blob[:4] == ENVELOPE_MAGIC, "envelope must start with EHY1 magic"
    pos = 4
    (count,) = struct.unpack_from("<I", blob, pos)
    pos += 4
    files: dict[str, bytes] = {}
    for _ in range(count):
        (path_len,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        path = blob[pos : pos + path_len].decode("utf-8")
        pos += path_len
        (content_len,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        content = blob[pos : pos + content_len]
        pos += content_len
        files[path] = content
    assert pos == len(blob), "envelope must consume all bytes"
    return files


class _PlaintextClient:
    """Just-enough plaintext API client for one short streaming exchange."""

    def __init__(
        self, reader: asyncio.StreamReader, writer: asyncio.StreamWriter
    ) -> None:
        self._reader = reader
        self._writer = writer

    async def send(self, msg_type: int, payload: bytes = b"") -> None:
        # Frame: 0x00 | varint(payload_size) | varint(message_id) | payload
        frame = (
            b"\x00" + _encode_varint(len(payload)) + _encode_varint(msg_type) + payload
        )
        self._writer.write(frame)
        await self._writer.drain()

    async def recv(self) -> tuple[int, bytes]:
        # Read preamble byte (must be 0x00 for plaintext).
        preamble = await self._reader.readexactly(1)
        assert preamble == b"\x00", f"unexpected preamble {preamble!r}"

        async def _read_varint_stream() -> int:
            result = 0
            shift = 0
            while True:
                byte = (await self._reader.readexactly(1))[0]
                result |= (byte & 0x7F) << shift
                if not (byte & 0x80):
                    return result
                shift += 7

        payload_size = await _read_varint_stream()
        msg_type = await _read_varint_stream()
        payload = await self._reader.readexactly(payload_size) if payload_size else b""
        return msg_type, payload


@pytest.mark.asyncio
async def test_store_yaml_recovery(
    yaml_config: str,
    run_compiled: RunCompiledFunction,
    unused_tcp_port: int,
) -> None:
    """Compile a host build with `store_yaml`, ask it to stream the YAML back,
    decompress, and verify the recovered file tree matches the source fixture."""
    async with run_compiled(yaml_config):
        # Open a raw TCP connection to the API server.
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection("127.0.0.1", unused_tcp_port),
            timeout=10.0,
        )
        client = _PlaintextClient(reader, writer)
        try:
            # HelloRequest: client_info (field 1, length-delimited string).
            # Password auth (the old ConnectRequest/Response exchange at message
            # IDs 3/4) was removed in 2026.1.0, so a successful HelloResponse is
            # all the handshake we need before issuing application requests.
            client_info = b"store_yaml integration test"
            api_version = b"\x10\x01\x18\x0e"  # api_version_major=1, minor=14
            hello_payload = (
                b"\x0a" + _encode_varint(len(client_info)) + client_info + api_version
            )
            await client.send(HELLO_REQUEST, hello_payload)
            msg_type, _ = await asyncio.wait_for(client.recv(), timeout=5.0)
            assert msg_type == HELLO_RESPONSE, f"expected HelloResponse, got {msg_type}"

            # The actual request under test.
            await client.send(GET_YAML_REQUEST, b"")
            chunks: list[bytes] = []
            advertised_total: int | None = None
            advertised_encoding: str | None = None
            done = False
            while not done:
                msg_type, payload = await asyncio.wait_for(client.recv(), timeout=5.0)
                if msg_type != GET_YAML_RESPONSE:
                    # Tolerate intervening server messages (e.g. pings).
                    continue
                chunk, done, total_size, encoding = _parse_get_yaml_response(payload)
                if encoding:
                    advertised_encoding = encoding
                if total_size and advertised_total is None:
                    advertised_total = total_size
                if chunk:
                    chunks.append(chunk)
        finally:
            writer.close()
            with contextlib.suppress(ConnectionError, OSError):
                await writer.wait_closed()

    compressed = b"".join(chunks)
    assert advertised_encoding == "zstd", (
        f"expected encoding 'zstd', got {advertised_encoding!r}"
    )
    assert advertised_total == len(compressed), (
        f"server advertised {advertised_total} bytes but we received {len(compressed)}"
    )
    envelope = zstd.decompress(compressed)
    files = _unpack_envelope(envelope)
    assert files, "envelope should contain at least one file"
    combined = b"\n".join(files.values())
    assert b"store-yaml-test" in combined, (
        "expected the fixture's device name to round-trip through the recovery blob"
    )
    assert b"store_yaml:" in combined, (
        "expected the store_yaml config line to be in the recovery blob"
    )
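
The plaintext framing described in the module docstring (``0x00 | varint(size) | varint(msg_type) | payload``) can be exercised without a socket. The following is a minimal sketch under that assumption only; `build_frame` and `parse_frame` are hypothetical helper names used for illustration and are not part of ESPHome or aioesphomeapi.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style varint."""
    if value < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at `pos`; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7


def build_frame(msg_type: int, payload: bytes = b"") -> bytes:
    """Hypothetical helper: 0x00 | varint(size) | varint(msg_type) | payload."""
    return b"\x00" + encode_varint(len(payload)) + encode_varint(msg_type) + payload


def parse_frame(frame: bytes) -> tuple[int, bytes]:
    """Hypothetical helper: split one complete frame into (msg_type, payload)."""
    if frame[:1] != b"\x00":
        raise ValueError("plaintext frames must start with a 0x00 preamble")
    size, pos = decode_varint(frame, 1)
    msg_type, pos = decode_varint(frame, pos)
    payload = frame[pos : pos + size]
    if len(payload) != size:
        raise ValueError("truncated frame")
    return msg_type, payload
```

With these helpers, the empty `GetYamlRequest` (message type 149) serialises to four bytes: the preamble, a zero size, and the two-byte varint for 149.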

@@ -0,0 +1,156 @@
"""Tests for the store_yaml component's file gathering and envelope packing."""

from __future__ import annotations

from pathlib import Path
import struct

import pytest

from esphome.components.store_yaml import (
    ENVELOPE_MAGIC,
    REDACTED_PLACEHOLDER,
    _gather_files,
    _pack_envelope,
)
from esphome.core import CORE, EsphomeError
from esphome.yaml_util import DiscoveredYamlFiles


def _unpack_envelope(blob: bytes) -> dict[str, bytes]:
    """Inverse of `_pack_envelope` for assertions in tests."""
    assert blob[:4] == ENVELOPE_MAGIC, "envelope must start with EHY1 magic"
    pos = 4
    (count,) = struct.unpack_from("<I", blob, pos)
    pos += 4
    files: dict[str, bytes] = {}
    for _ in range(count):
        (path_len,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        path = blob[pos : pos + path_len].decode("utf-8")
        pos += path_len
        (content_len,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        content = blob[pos : pos + content_len]
        pos += content_len
        files[path] = content
    assert pos == len(blob), "envelope must consume all bytes"
    return files


@pytest.fixture
def project(tmp_path: Path) -> Path:
    """Lay out a tiny ESPHome-like project: entry yaml, an include, and a secrets file."""
    project_dir = tmp_path / "project"
    project_dir.mkdir()
    (project_dir / "entry.yaml").write_text("esphome:\n  name: test\n")
    (project_dir / "wifi.yaml").write_text("ssid: my_ssid\npassword: my_password\n")
    (project_dir / "secrets.yaml").write_text("api_key: SUPER_SECRET\n")
    return project_dir


@pytest.fixture(autouse=True)
def _reset_core() -> None:
    CORE.data.pop("yaml_sources", None)
    CORE.config_path = None
    yield
    CORE.data.pop("yaml_sources", None)
    CORE.config_path = None


def _set_sources(project_dir: Path, *names: str, secrets: tuple[str, ...] = ()) -> None:
    CORE.config_path = project_dir / "entry.yaml"
    files = [project_dir / name for name in names]
    secret_paths = {(project_dir / name).resolve() for name in secrets}
    CORE.data["yaml_sources"] = DiscoveredYamlFiles(files, secret_paths)


def test_gather_redacts_secrets_by_default(project: Path) -> None:
    _set_sources(
        project,
        "entry.yaml",
        "wifi.yaml",
        "secrets.yaml",
        secrets=("secrets.yaml",),
    )
    files = dict(_gather_files(include_secrets=False))
    assert files["secrets.yaml"] == REDACTED_PLACEHOLDER
    assert b"SUPER_SECRET" not in files["secrets.yaml"]
    assert files["wifi.yaml"] == (project / "wifi.yaml").read_bytes()


def test_gather_redacts_yml_extension(project: Path) -> None:
    yml = project / "secrets.yml"
    yml.write_text("api_key: OTHER_SECRET\n")
    _set_sources(project, "entry.yaml", "secrets.yml", secrets=("secrets.yml",))
    files = dict(_gather_files(include_secrets=False))
    assert files["secrets.yml"] == REDACTED_PLACEHOLDER


def test_gather_redacts_secret_symlinked_to_other_name(
    project: Path, tmp_path: Path
) -> None:
    """A `secrets.yaml` symlinked to a non-secrets-named target is still redacted
    because the un-resolved basename was captured upstream."""
    target = tmp_path / "actual_creds.yaml"
    target.write_text("api_key: FROM_SYMLINK\n")
    link = project / "secrets.yaml"
    link.unlink()  # remove the regular file laid down by the fixture
    link.symlink_to(target)
    # Discovery records the un-resolved listener fname under SECRETS_FILES
    # but stores the resolved path; mimic that here.
    resolved = link.resolve()
    CORE.config_path = project / "entry.yaml"
    CORE.data["yaml_sources"] = DiscoveredYamlFiles([resolved], {resolved})
    files = dict(_gather_files(include_secrets=False))
    assert REDACTED_PLACEHOLDER in files.values()
    assert b"FROM_SYMLINK" not in b"".join(files.values())


def test_gather_embeds_secrets_when_opted_in(project: Path) -> None:
    _set_sources(project, "entry.yaml", "secrets.yaml", secrets=("secrets.yaml",))
    files = dict(_gather_files(include_secrets=True))
    assert b"SUPER_SECRET" in files["secrets.yaml"]


def test_gather_uses_relative_path_for_external_files(
    project: Path, tmp_path: Path
) -> None:
    """Files outside the project root use a ``..``-style relative path so they don't collide."""
    sibling = tmp_path / "outside.yaml"
    sibling.write_text("foo: bar\n")
    CORE.config_path = project / "entry.yaml"
    CORE.data["yaml_sources"] = DiscoveredYamlFiles(
        [project / "entry.yaml", sibling], set()
    )
    files = dict(_gather_files(include_secrets=False))
    # project root is `tmp_path/project`, sibling is in `tmp_path` so it
    # resolves to `../outside.yaml`.
    assert "../outside.yaml" in files


def test_gather_raises_when_no_sources(project: Path) -> None:
    CORE.config_path = project / "entry.yaml"
    with pytest.raises(EsphomeError):
        _gather_files(include_secrets=False)


def test_pack_envelope_roundtrip() -> None:
    files = [
        ("entry.yaml", b"esphome:\n  name: test\n"),
        ("wifi.yaml", b"ssid: a\n"),
    ]
    blob = _pack_envelope(files)
    assert _unpack_envelope(blob) == dict(files)


def test_pack_envelope_handles_utf8_paths() -> None:
    files = [("dossiers/maison.yaml", b"foo: bar\n")]
    blob = _pack_envelope(files)
    assert _unpack_envelope(blob) == dict(files)


def test_pack_envelope_rejects_overlong_path() -> None:
    long_path = "a" * (0xFFFF + 1)
    with pytest.raises(EsphomeError):
        _pack_envelope([(long_path, b"")])
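
The layout that `_unpack_envelope` reads implies a matching packer: the `EHY1` magic, a little-endian `uint32` file count, then for each file a `uint16` path length, the UTF-8 path, a `uint32` content length, and the content. The sketch below is a guess at that inverse, written only from the unpacker and the tests above; `pack_envelope_sketch` is a hypothetical name, it is not the component's actual `_pack_envelope`, and it raises `ValueError` where the real function raises `EsphomeError`.

```python
import struct

ENVELOPE_MAGIC = b"EHY1"
MAX_PATH_LEN = 0xFFFF  # the path length is stored in an unsigned 16-bit field


def pack_envelope_sketch(files: list[tuple[str, bytes]]) -> bytes:
    """Hypothetical packer mirroring the layout `_unpack_envelope` expects."""
    out = bytearray(ENVELOPE_MAGIC)
    out += struct.pack("<I", len(files))  # file count, little-endian uint32
    for path, content in files:
        encoded = path.encode("utf-8")
        if len(encoded) > MAX_PATH_LEN:
            # Assumption: the real implementation raises EsphomeError here.
            raise ValueError(f"path too long for envelope: {len(encoded)} bytes")
        out += struct.pack("<H", len(encoded))  # path length, uint16
        out += encoded
        out += struct.pack("<I", len(content))  # content length, uint32
        out += content
    return bytes(out)
```

A single one-byte file named `a.yaml` packs to 21 bytes: 4 for the magic, 4 for the count, 2 + 6 for the path, and 4 + 1 for the content.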