Failure in column_anomalies test when column_timestamp isn't provided #1554

Open
angeml opened this issue Jun 13, 2024 · 3 comments

angeml commented Jun 13, 2024

Describe the bug
This line of code causes the column_anomalies test to fail on Databricks when no timestamp_column is configured.

Caused by: org.apache.spark.sql.catalyst.ExtendedAnalysisException: [UNRESOLVED_COLUMN.WITHOUT_SUGGESTION] A column, variable, or function parameter with name `last_session_start_ts` cannot be resolved.  SQLSTATE: 42703; line 30 pos 19
	at org.apache.spark.sql.catalyst.ExtendedAnalysisException.copyPlan(ExtendedAnalysisException.scala:91)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.$anonfun$execute$1(SparkExecuteStatementOperation.scala:688)

I've resolved this locally by adding a timestamp_column, but others might not have that option.

To Reproduce

  1. Create a column_anomalies test for a model that doesn't have a timestamp_column
  2. Run test on Databricks
  3. Observe that the trailing comma (,) at the end of start_bucket_in_data causes an error on Databricks
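
To illustrate the failure mode described in step 3, here is a minimal sketch in Python. It is not Elementary's actual macro code: the function names, the `source_table` name, and the `updated_at` column are hypothetical, and only `start_bucket_in_data` comes from this report. It assumes the query is rendered by appending a comma after each column and then adding the timestamp column only when one is configured, which would leave a dangling comma when it is not.

```python
from typing import Optional, Sequence


def render_select_naive(columns: Sequence[str], timestamp_column: Optional[str] = None) -> str:
    """Hypothetical renderer: writes a comma after every column, then the
    timestamp column if there is one. Without a timestamp column the last
    item keeps its comma, so the SELECT list ends with ','."""
    sql = "select\n"
    for col in columns:
        sql += f"    {col},\n"
    if timestamp_column is not None:
        sql += f"    {timestamp_column}\n"
    sql += "from source_table"  # source_table is a placeholder name
    return sql


def render_select_safe(columns: Sequence[str], timestamp_column: Optional[str] = None) -> str:
    """Collects the items first and joins them, so a comma only ever
    appears between two items, never after the last one."""
    items = list(columns)
    if timestamp_column is not None:
        items.append(timestamp_column)
    body = ",\n".join(f"    {item}" for item in items)
    return f"select\n{body}\nfrom source_table"


if __name__ == "__main__":
    # No timestamp column: the naive version ends the list with a comma.
    print(render_select_naive(["start_bucket_in_data"]))
    print()
    # The joined version stays valid with or without a timestamp column.
    print(render_select_safe(["start_bucket_in_data"]))
    print()
    print(render_select_safe(["start_bucket_in_data"], "updated_at"))
```

Some SQL dialects tolerate a trailing comma in the SELECT list and others reject it, which would be consistent with the same rendered query passing on one warehouse and failing on another. Joining a list of items (or the equivalent loop-aware separator in a template) avoids depending on that behavior.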

Expected behavior
This test should not produce an error.

Screenshots
Screenshot 2024-06-13 at 12 46 17 PM

Screenshot 2024-06-13 at 12 47 01 PM

Environment (please complete the following information):

  • Elementary CLI (edr) version: 0.15.1 (can be found by running pip show elementary-data)
  • Elementary dbt package version: 0.15.2 (can be found in the packages.yml file)
  • dbt version: 1.7.1
  • Data warehouse: Databricks
  • Infrastructure details: prod

Additional context
Slack - https://elementary-community.slack.com/archives/C02CTC89LAX/p1716306300184349

Would you be willing to contribute a fix for this issue?
For sure 👍 But I think it just needs a comma removal 😄

angeml added the Bug and Triage 👀 labels Jun 13, 2024

angeml commented Jun 14, 2024

Just realized that I likely should have created this issue in the https://github.com/elementary-data/dbt-data-reliability repo

haritamar (Collaborator) commented

Hi @angeml!
Thanks for opening this issue, and sorry for the delayed response.
Yes, you are absolutely right. It seems this flow was broken, and we already have a PR that fixes it, which should be merged in the near future.

Larissa-Rocha commented Jun 27, 2024

Hi guys! I've been experiencing the same issue with the column_anomalies test running on Trino, so I'm looking forward to this fix.
