1. According to the decision framework in Lesson 4, at approximately what corpus size does AlloyDB become less cost-effective than managed vector databases or Vertex AI Vector Search?
Correct. The lesson states AlloyDB is appropriate "for corpora under ~100M vectors" — beyond that, managed vector databases or Vertex AI Vector Search are more cost-effective.
The lesson specifies ~100 million vectors as the threshold beyond which AlloyDB becomes less cost-effective than managed vector services.
2. In the RAGAS framework, a high Context Precision score combined with a low Context Recall score indicates what condition?
Correct. High precision = what is retrieved is relevant. Low recall = not everything relevant is being retrieved. This pattern suggests coverage gaps in the corpus or a top-k value that's too small.
Incorrect. High precision with low recall means the retriever finds good chunks when it finds anything, but misses significant portions of the relevant information — pointing to coverage gaps or insufficient top-k.
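To make the pattern in question 2 concrete, here is a minimal sketch that computes set-based precision and recall over chunk IDs. It is a simplification, not the RAGAS formula (RAGAS derives its scores from LLM judgments), and the function name and chunk IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall over chunk IDs (simplified, not the RAGAS formula)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical case: top-k = 2, but five chunks are needed to answer fully.
precision, recall = precision_recall(["c1", "c2"], ["c1", "c2", "c3", "c4", "c5"])
```

Both retrieved chunks are relevant, so precision is high, but three of the five relevant chunks were never retrieved, so recall is low. Raising top-k or filling corpus gaps are the two levers the feedback above points to.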
3. Google Pay's fraud pipeline achieved sub-200 ms global event delivery by using which Pub/Sub delivery mode?
Correct. Google Pay replaced polling with Pub/Sub push subscriptions feeding Dataflow, achieving sub-200 ms P99 global delivery even under 40× traffic spikes.
Incorrect. Google Pay used push subscriptions feeding Dataflow pipelines to achieve sub-200 ms global event delivery.
4. The Vertex AI Ranking API is positioned in the RAG pipeline:
Correct. The Ranking API is the second stage of a two-stage retrieval pipeline: fast vector retrieval gets candidates, the Ranking API reorders them more accurately, then the top reranked chunks go to the generative model.
The Ranking API operates after initial retrieval but before generation — it reorders retrieved candidates more accurately before the model sees them.
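The placement described in question 4 can be sketched as a generic retrieve-then-rerank function. This does not call the Vertex AI Ranking API: `fast_score` and `precise_score` are hypothetical stand-ins for the vector search and the reranker, and the word-overlap scorers exist only to make the example runnable.

```python
def retrieve_then_rerank(query, corpus, fast_score, precise_score,
                         n_candidates=20, top_k=3):
    """Stage 1: cheap scoring picks candidates. Stage 2: a costlier scorer reorders them."""
    candidates = sorted(corpus, key=lambda c: fast_score(query, c), reverse=True)[:n_candidates]
    reranked = sorted(candidates, key=lambda c: precise_score(query, c), reverse=True)
    return reranked[:top_k]  # only these chunks would be passed to the generative model

def overlap(query, chunk):
    # Toy stand-in for vector similarity: number of shared words.
    return len(set(query.split()) & set(chunk.split()))

def jaccard(query, chunk):
    # Toy stand-in for a reranker: shared words relative to all words.
    q, c = set(query.split()), set(chunk.split())
    return len(q & c) / len(q | c)

corpus = [
    "reset your password in settings",
    "password policy requires twelve characters and a password manager and rotation",
    "billing questions go to finance",
]
top = retrieve_then_rerank("reset password", corpus, overlap, jaccard,
                           n_candidates=2, top_k=1)
```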
5. A Dataflow session window has a 30-minute gap threshold. A user is active at 10:00, 10:15, and then again at 11:00. How many session windows are created?
Correct. The 45-minute gap between 10:15 and 11:00 exceeds the 30-minute threshold, so the first session window ends at 10:45 (10:15 plus the 30-minute gap) and a new session opens at 11:00.
Incorrect. The gap between 10:15 and 11:00 is 45 minutes — exceeding the 30-minute threshold — so two separate sessions are created.
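The arithmetic in question 5 can be checked with a small sketch of gap-based session grouping. This is plain Python rather than Dataflow code; the function name and the date are hypothetical, and it assumes an event arriving exactly at a session's end starts a new session.

```python
from datetime import datetime, timedelta

def session_windows(timestamps, gap):
    """Group event timestamps into (start, end) sessions; end = last event + gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # Event falls inside the open session: extend the session's end.
            sessions[-1] = (sessions[-1][0], ts + gap)
        else:
            # Gap exceeded (or first event): open a new session.
            sessions.append((ts, ts + gap))
    return sessions

day = datetime(2024, 1, 1)
events = [day.replace(hour=10), day.replace(hour=10, minute=15), day.replace(hour=11)]
windows = session_windows(events, timedelta(minutes=30))
```

The events at 10:00 and 10:15 merge into one session ending at 10:45, and the 11:00 event opens a second session.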
6. What GCS feature retains prior versions of an object when it is overwritten, enabling agents to audit document history?
Object Versioning retains all prior versions of an object when overwritten or deleted, creating an audit trail agents can inspect.
Object Versioning is the correct feature. Bucket Lock enforces retention policies; Uniform Bucket-Level Access (UBLA) restricts access control to IAM; Lifecycle Management transitions objects to other storage classes or deletes them.
7. What is the key operational difference between push and pull Pub/Sub subscriptions for agent endpoints?
Correct. Push subscriptions HTTP-POST to a configured endpoint (ideal for Cloud Run agents that don't poll). Pull subscriptions require the consumer to request messages from Pub/Sub itself (a pull or streaming-pull call).
Incorrect. The key difference is delivery direction: push sends to an endpoint, pull requires the consumer to fetch. Both provide at-least-once delivery.
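On the push side of question 7, the agent endpoint receives a JSON envelope in the POST body, with the message data base64-encoded. The sketch below decodes such a body using only the standard library; the function name, project, subscription, and payload are hypothetical sample values, and a real handler would also return a success status code to acknowledge the message.

```python
import base64
import json

def decode_push_envelope(body: bytes) -> dict:
    """Extract the payload and attributes from a Pub/Sub push request body."""
    envelope = json.loads(body)
    message = envelope["message"]
    payload = base64.b64decode(message.get("data", "")).decode("utf-8")
    return {
        "message_id": message.get("messageId"),
        "attributes": message.get("attributes", {}),
        "payload": payload,
    }

# Hypothetical request body, shaped like a push delivery.
sample_body = json.dumps({
    "message": {
        "data": base64.b64encode(b'{"event": "doc_updated"}').decode("ascii"),
        "messageId": "123",
        "attributes": {"source": "demo"},
    },
    "subscription": "projects/example-project/subscriptions/example-sub",
}).encode("utf-8")
decoded = decode_push_envelope(sample_body)
```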
8. What is "document fingerprinting" designed to prevent in a production RAG pipeline?
Correct. Document fingerprinting (content hashing) detects when source documents have changed since last ingestion. Changed documents trigger re-embedding and vector upsert, preventing content drift where the retriever returns chunks reflecting old policy or outdated facts.
Incorrect. Document fingerprinting specifically addresses content drift: detecting that a document has changed since it was last embedded, and triggering re-ingestion so the vector store reflects current content.
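A minimal sketch of the fingerprint check from question 8, assuming SHA-256 content hashes kept alongside each document ID. The function names and sample documents are hypothetical; the re-embedding and vector upsert steps are left out.

```python
import hashlib

def fingerprint(text: str) -> str:
    """Content hash used to detect changes since the last ingestion."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def docs_to_reembed(current_docs: dict, stored_fingerprints: dict) -> list:
    """Return IDs of documents that are new or whose content hash changed."""
    return sorted(
        doc_id for doc_id, text in current_docs.items()
        if stored_fingerprints.get(doc_id) != fingerprint(text)
    )

# Fingerprints recorded at the previous ingestion run.
stored = {
    "policy.md": fingerprint("Refunds within 30 days."),
    "faq.md": fingerprint("Contact support by email."),
}
# Current source content: one edited document, one unchanged, one new.
current = {
    "policy.md": "Refunds within 14 days.",
    "faq.md": "Contact support by email.",
    "new.md": "Brand new document.",
}
changed = docs_to_reembed(current, stored)
```

Only the edited and new documents are flagged, so unchanged content is not re-embedded.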
9. What does the outbox pattern achieve in an event-driven agent pipeline?
Correct. The outbox pattern writes intended actions to a transactional table in the same transaction as state changes. A separate worker reads and executes actions, marking them done, so no intended action is lost under failure and, with idempotent execution, each side effect takes effect exactly once.
Incorrect. The outbox pattern writes intended actions to a transactional table, decoupling event processing from action execution so that, with idempotent execution, each side effect takes effect exactly once.
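The outbox pattern from question 9 can be sketched with an in-memory SQLite database. The table names, the `notify:` action format, and the function names are assumptions for illustration, not part of the lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         action TEXT, done INTEGER DEFAULT 0);
""")

def handle_event(order_id):
    # The state change and the intended action commit (or roll back) together.
    with conn:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'flagged')", (order_id,))
        conn.execute("INSERT INTO outbox (action) VALUES (?)", (f"notify:{order_id}",))

def run_worker(execute):
    # Separate worker: perform each pending action, then mark it done.
    rows = conn.execute(
        "SELECT id, action FROM outbox WHERE done = 0 ORDER BY id"
    ).fetchall()
    for row_id, action in rows:
        execute(action)
        with conn:
            conn.execute("UPDATE outbox SET done = 1 WHERE id = ?", (row_id,))

sent = []
handle_event(42)
run_worker(sent.append)
run_worker(sent.append)  # second pass finds nothing pending
```

If the worker crashed between `execute(action)` and the `UPDATE`, the action would be picked up again on the next pass, which is why the executed action should be safe to repeat.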
10. The Vertex AI Architecture Center reference (2024) found that teams using query rewriting + hybrid retrieval + reranking together achieved what average context recall improvement over baseline dense-only retrieval?
Correct. The Google Cloud Architecture Center's RAG optimization guide documented an average 28 percentage point recall improvement across five customer deployments when all three retrieval improvements were combined.
28 percentage points is the documented figure from Google Cloud's Architecture Center RAG optimization guide, across five enterprise deployments.
11. The task_type parameter RETRIEVAL_DOCUMENT should be assigned to:
Correct. RETRIEVAL_DOCUMENT is for indexed content; RETRIEVAL_QUERY is for user queries. The asymmetry is intentional — it optimizes the embedding space for cross-type matching.
Incorrect. Documents use RETRIEVAL_DOCUMENT; queries use RETRIEVAL_QUERY. Mismatching these is a common bug that silently degrades retrieval quality.
12. Why is chunk overlap (e.g., 50 tokens of overlap between consecutive 512-token chunks) important?
Correct. Without overlap, a sentence split across the boundary of two consecutive chunks might not appear in retrievable form in either chunk. Overlap ensures boundary-spanning information is captured in at least one chunk.
Incorrect. Overlap ensures information at chunk boundaries appears fully in at least one chunk. A sentence split across a hard boundary with no overlap might appear incomplete in both adjacent chunks.
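A minimal sketch of overlapping chunking, using the 512-token chunks and 50-token overlap from question 12. The function name is hypothetical, and integers stand in for real token IDs.

```python
def chunk_with_overlap(tokens, chunk_size=512, overlap=50):
    """Split tokens into chunks where each chunk repeats the last `overlap` tokens of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    stride = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this chunk already reaches the end of the input
    return chunks

tokens = list(range(1000))  # stand-in for real token IDs
chunks = chunk_with_overlap(tokens)
```

The last 50 tokens of each chunk are repeated at the start of the next, so a sentence that straddles a boundary is more likely to appear whole in one of the two chunks.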
13. Box's engineering team reported that one change reduced retrieval misses on conversational queries from 31% to 9%. What was it?
Correct. As reported at Google Cloud Next 2024, conversation condensation was Box AI's highest single-impact improvement — resolving the implicit reference problem in conversational follow-up queries.
Incorrect. Box's reported highest-impact change was conversation condensation, rewriting follow-up questions that depend on prior context into standalone retrieval queries.
14. What is the primary advantage of Uniform Bucket-Level Access for document agent pipelines?
Correct. UBLA eliminates the complexity of per-object ACLs, ensuring all access is governed by IAM policies alone — making security auditing reliable and predictable.
UBLA's value is security: it removes per-object ACLs so all access is controlled exclusively through IAM, making permission auditing deterministic.
15. Application Default Credentials (ADC) resolve credentials from which source when an agent runs on Cloud Run?
Correct. ADC automatically uses the service account attached to the Cloud Run service — no credential files or environment variables needed.
Incorrect. ADC on Cloud Run uses the service account attached to the service at deploy time — automatically, without credential files.
16. Which architectural pattern is most appropriate for a Vertex AI agent that needs data from an S3 bucket with strict freshness requirements of under 5 seconds?
Correct. Sub-5-second freshness requires the real-time event bridge pattern.
For sub-5-second freshness, the real-time event bridge (Kinesis → Pub/Sub) is the appropriate choice.
17. Which statement correctly describes HyDE (Hypothetical Document Embeddings)?
Correct. A hypothetical answer naturally uses the vocabulary of real documents, making it closer in embedding space to relevant corpus chunks than a casual user query would be. Published by Luyu Gao et al. (CMU, 2022).
Incorrect. HyDE generates a hypothetical document that would answer the query — using domain vocabulary — then embeds that hypothetical document for ANN search rather than embedding the original query.
18. When should an event-driven agent use a session window rather than a tumbling window?
Correct. Session windows capture a user's continuous behavioral activity (all events until a gap exceeds the threshold) as one coherent unit — ideal for user journey analysis, session-level fraud scoring, and activity pattern detection.
Incorrect. Session windows are chosen when continuous user activity (closed by inactivity gaps) is the meaningful unit of analysis, not for throughput or delivery guarantee reasons.
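For contrast with question 18, the sketch below shows how a tumbling window assigns events purely by clock boundaries. It is plain Python rather than Dataflow code; the function name, the epoch origin, and the sample times are assumptions.

```python
from datetime import datetime, timedelta

def tumbling_window_start(ts, size, origin=datetime(1970, 1, 1)):
    """Start of the fixed-size window containing ts, aligned to `origin`."""
    return origin + ((ts - origin) // size) * size

size = timedelta(minutes=30)
first = tumbling_window_start(datetime(2024, 1, 1, 10, 25), size)
second = tumbling_window_start(datetime(2024, 1, 1, 10, 35), size)
```

The two events are only 10 minutes apart, yet they land in different tumbling windows (starting at 10:00 and 10:30). A session window with a 30-minute gap would keep them in the same session, which is why it suits analysis of continuous user activity.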
19. Dataplex's three core capabilities relevant to agentic workflows are:
Correct. Dataplex's agent-relevant capabilities are: (1) metadata catalog — discovering and describing all data assets; (2) data quality rules — scoring asset reliability; (3) lineage tracking — tracing data provenance from source to output.
Incorrect. Dataplex's three agent-relevant capabilities are: unified metadata catalog, data quality rules (with scored checks), and data lineage tracking. It does not handle vector indexing or embedding generation.
20. What file format provides the best query performance for BigQuery Omni against S3 data, and why?
Correct. Parquet's columnar structure enables Omni to read only the queried columns, dramatically reducing scan cost and time.
Parquet is recommended for its columnar structure, which enables column pruning — Omni reads only the columns your query actually references.