Building Your First Data Source
This is part 2 of a four-part series. Start with Part 1 — Your First Resource. Part 3 adds a provider function. Part 4 adds an ephemeral resource.
In part 1 we built a mycloud_server resource. Now we’ll add a data source — a read-only query that lets Terraform look up a server’s details without owning it.
Data sources follow the same pattern as resources, with two differences:
- No create/update/delete: only read and _validate_config
- Based on BaseDataSource instead of BaseResource
Add the Data Source
Create my_provider/server_info.py:
from attrs import define

from pyvider.data_sources import register_data_source
from pyvider.data_sources.base import BaseDataSource
from pyvider.resources.context import ResourceContext
from pyvider.schema import PvsSchema, a_str, s_data_source

from my_provider.server import Server  # access the in-memory store


@define
class ServerInfoConfig:
    server_id: str


@define
class ServerInfoState:
    id: str
    name: str
    status: str


@register_data_source("server_info")
class ServerInfo(BaseDataSource):
    config_class = ServerInfoConfig
    state_class = ServerInfoState

    @classmethod
    def get_schema(cls) -> PvsSchema:
        return s_data_source({
            "server_id": a_str(required=True, description="ID of the server to look up"),
            "id": a_str(computed=True, description="Server ID"),
            "name": a_str(computed=True, description="Server name"),
            "status": a_str(computed=True, description="Server status"),
        })

    async def _validate_config(self, config: ServerInfoConfig) -> list[str]:
        if not config.server_id:
            return ["server_id cannot be empty"]
        return []

    async def read(self, ctx: ResourceContext) -> ServerInfoState | None:
        data = Server._servers.get(ctx.config.server_id)
        if not data:
            return None  # Server doesn't exist; Terraform will surface an error
        return ServerInfoState(**data)
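To try the lookup behaviour in isolation, here is a minimal sketch of the read logic that runs without Pyvider installed. The SERVERS dict is a hypothetical stand-in for the Server._servers store from part 1, the sample record (including its extra region key) is invented for illustration, and lookup is an illustrative helper rather than part of the Pyvider API. It also filters the stored record down to the fields the state class declares, on the assumption that the store might hold extra keys that ServerInfoState(**data) would otherwise reject.

```python
from __future__ import annotations

from dataclasses import dataclass, fields

# Hypothetical stand-in for the in-memory Server._servers store from part 1.
# The "region" key is an invented extra field used to show the filtering step.
SERVERS: dict[str, dict[str, str]] = {
    "srv-042": {"id": "srv-042", "name": "web-01", "status": "running", "region": "eu-1"},
}


@dataclass
class ServerInfoState:
    id: str
    name: str
    status: str


def lookup(server_id: str) -> ServerInfoState | None:
    """Return the state for a known server, or None when the ID is unknown."""
    data = SERVERS.get(server_id)
    if not data:
        return None
    # Keep only the keys the state class declares, so extra keys do not raise TypeError.
    allowed = {f.name for f in fields(ServerInfoState)}
    return ServerInfoState(**{k: v for k, v in data.items() if k in allowed})


found = lookup("srv-042")    # a populated ServerInfoState
missing = lookup("srv-999")  # None, mirroring the "server doesn't exist" branch
```

If the store in part 1 only ever holds id, name, and status, the filtering step is unnecessary and the plain ServerInfoState(**data) call is enough.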
Register It
Import the data source in my_provider/__init__.py so Pyvider discovers it:
from pyvider.providers import BaseProvider, ProviderMetadata, register_provider
from pyvider.schema import PvsSchema, s_provider

import my_provider.server  # registers mycloud_server
import my_provider.server_info  # registers mycloud_server_info


@register_provider("mycloud")
class MyCloudProvider(BaseProvider):
    def __init__(self) -> None:
        super().__init__(
            metadata=ProviderMetadata(
                name="mycloud",
                version="0.1.0",
                protocol_version="6",
            )
        )

    @classmethod
    def get_schema(cls) -> PvsSchema:
        return s_provider({})
Update the Terraform Configuration
Update main.tf to query the server after creating it:
terraform {
  required_providers {
    mycloud = {
      source  = "example.com/tutorial/mycloud"
      version = "0.1.0"
    }
  }
}

provider "mycloud" {}

resource "mycloud_server" "web" {
  name = "web-01"
}

data "mycloud_server_info" "web" {
  server_id = mycloud_server.web.id
}

output "server_id"     { value = mycloud_server.web.id }
output "server_name"   { value = data.mycloud_server_info.web.name }
output "server_status" { value = data.mycloud_server_info.web.status }
The data source depends on the resource: Terraform will create the server first, then query it. You can also use a data source on its own — for a server created outside Terraform — by passing the ID directly:
data "mycloud_server_info" "existing" {
  server_id = "srv-042"
}
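The create-then-query ordering described above follows from the reference to mycloud_server.web.id inside the data block. As a rough illustration only (this is not Terraform's implementation), the same ordering falls out of a topological sort over the references in main.tf, sketched here with Python's standard graphlib module. The dependency map is written by hand from the configuration above.

```python
from graphlib import TopologicalSorter

# Each key depends on the addresses in its set, copied by hand from the references in main.tf.
dependencies = {
    "mycloud_server.web": set(),
    "data.mycloud_server_info.web": {"mycloud_server.web"},
    "output.server_id": {"mycloud_server.web"},
    "output.server_name": {"data.mycloud_server_info.web"},
    "output.server_status": {"data.mycloud_server_info.web"},
}

# static_order() yields every node after all of the nodes it depends on.
order = list(TopologicalSorter(dependencies).static_order())
```

The standalone "existing" data source has no references to other objects, so in this model it would have an empty dependency set and no ordering constraint.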
Required Methods on Every Data Source
| Method | Purpose |
|---|---|
| get_schema | Declares query params (required/optional) and result fields (computed) |
| _validate_config | Return a list of errors, or [] to pass |
| read | Return the current state, or None if the object doesn’t exist |
That’s all — no _create_apply, _update_apply, or _delete_apply. Data sources are read-only by design.
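The validation contract in the table can be exercised on its own. The sketch below mirrors the tutorial's _validate_config check as a free-standing coroutine, using a plain dataclass in place of the attrs class so that it runs without Pyvider or attrs installed. The function name validate_config is a hypothetical stand-in for the method.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ServerInfoConfig:
    server_id: str


async def validate_config(config: ServerInfoConfig) -> list[str]:
    """Mirror of the tutorial's check: an empty server_id yields one error message."""
    if not config.server_id:
        return ["server_id cannot be empty"]
    return []


# An empty list means the configuration passed; any strings are reported as errors.
ok_errors = asyncio.run(validate_config(ServerInfoConfig(server_id="srv-042")))
bad_errors = asyncio.run(validate_config(ServerInfoConfig(server_id="")))
```

Note that a whitespace-only ID such as " " is truthy in Python and would pass this check as written.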
Next: Part 3 — Adding a Function to generate consistent server names directly in your Terraform expressions.