
Fixing issue with UTF-8 conversion in Python3 for GCS uploads #268

Merged

Conversation


bucklander commented Aug 4, 2020

SUMMARY

Fixing issue with UTF-8 conversion in Python3 for GCS uploads (#266)

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

gcp_storage_object

ADDITIONAL INFORMATION

This bug can be resolved by simply changing gcp_storage_object.py#L254 from

with open(src, "r") as file_obj:

to

with open(src, "rb") as file_obj:

Hence, this PR. This small change ensures there's no decoding attempt (or subsequent error) by forcing the file to be read as binary via the open func.

This approach mirrors download_file's use of open() at gcp_storage_object.py#L243, which specifically passes the b (binary) mode specifier. Not only does this resolve the issue with uploading binary files, but the change doesn't appear to harm text uploads. I'm unsure, though, whether this is a safe strategy overall, so it should probably be reviewed carefully by the Google Ansible module maintainers.
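As an illustration (a minimal sketch, not the module code; the file name is a placeholder), this is why text mode fails for binary payloads under Python 3 while binary mode passes the bytes through untouched:

# With open(src, "r"), Python 3 decodes the stream as UTF-8 and raises
# UnicodeDecodeError on arbitrary binary content; with open(src, "rb"),
# the raw bytes are returned and no decoding is attempted.
src = "backup.tar"  # hypothetical binary file
with open(src, "rb") as file_obj:
    data = file_obj.read()  # bytes, handed to the upload call unchanged

Below are test runs of text and binary uploads under both python2 and python3 with the change applied: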

$ ansible-playbook --connection=local --inventory localhost, upload-text-file.yml -e ansible_python_interpreter=`which python2`


PLAY [Upload a file to GCS using gcp_storage_object module] ****************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************
ok: [localhost]

TASK [Upload Latest Backup(s) to GCS Bucket] *******************************************************************************************************************************************************
changed: [localhost]

PLAY RECAP *****************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
$ ansible-playbook --connection=local --inventory localhost, upload-text-file.yml -e ansible_python_interpreter=`which python3`


PLAY [Upload a file to GCS using gcp_storage_object module] ****************************************************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************
ok: [localhost]

TASK [Upload Latest Backup(s) to GCS Bucket] *******************************************************************************************************************************************************
changed: [localhost]

PLAY RECAP *****************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
$ ansible-playbook --connection=local --inventory localhost, upload-bin-file.yml -e ansible_python_interpreter=`which python2`

PLAY [Upload a binary file to Google Cloud Storage bucket using gcp_storage_object module] *********************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************
ok: [localhost]

TASK [Upload Latest Backup(s) to GCS Bucket] *******************************************************************************************************************************************************
changed: [localhost]

PLAY RECAP *****************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
$ ansible-playbook --connection=local --inventory localhost, upload-bin-file.yml -e ansible_python_interpreter=`which python3`

PLAY [Upload a binary file to Google Cloud Storage bucket using gcp_storage_object module] *********************************************************************************************************

TASK [Gathering Facts] *****************************************************************************************************************************************************************************
ok: [localhost]

TASK [Upload Latest Backup(s) to GCS Bucket] *******************************************************************************************************************************************************
changed: [localhost]

PLAY RECAP *****************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   


bucklander commented Aug 5, 2020

Additionally, according to this Google Cloud doc, the use of curl's --data-binary argument suggests the preference is to read the given file in binary and upload it exactly as given (i.e., no conversion):

curl -X POST --data-binary @[OBJECT_LOCATION] \
-H "Authorization: Bearer [OAUTH2_TOKEN]" \
-H "Content-Type: [OBJECT_CONTENT_TYPE]" \
"https://storage.googleapis.com/upload/storage/v1/b/[BUCKET_NAME]/o?uploadType=media&name=[OBJECT_NAME]"

It would seem this change makes the module's behavior consistent with that example.
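For comparison, a minimal Python sketch of the same media-upload request (assuming the requests library; the token, bucket, and file names are placeholders), reading the file in binary just as the curl example does:

import requests

oauth2_token = "OAUTH2_TOKEN"  # placeholder: obtain a real token out of band
bucket_name = "my-bucket"      # placeholder bucket name
object_name = "backup.tar"     # placeholder object name

with open("backup.tar", "rb") as f:  # binary mode: bytes are sent exactly as stored
    resp = requests.post(
        f"https://storage.googleapis.com/upload/storage/v1/b/{bucket_name}/o",
        params={"uploadType": "media", "name": object_name},
        headers={
            "Authorization": f"Bearer {oauth2_token}",
            "Content-Type": "application/x-tar",
        },
        data=f,  # file object streamed without any text decoding
    )
resp.raise_for_status()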


@alvaro-gh

I think I spent two hours before finding this. Good old Ansible.


bucklander commented Oct 11, 2020

I've since given up here; it doesn't seem to me that Google Cloud is investing much in their Ansible module collection. If anyone else runs across this issue, I recommend interacting with Google's Cloud APIs directly via Ansible's uri module instead wherever possible.

A simple example set of tasks for a GCE instance uploading to a GCS bucket:

- name: Retrieve GCS Access Token
  uri:
    url: http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
    method: GET
    headers:
      Metadata-Flavor: Google
    return_content: yes
  run_once: true
  register: gcs_token_resp

- name: Upload Tarball to GCS Bucket
  uri:
    url: "https://storage.googleapis.com/upload/storage/v1/b/{{ gcs_bucket_name }}/o?uploadType=media&name={{ filename }}.tar"
    timeout: 720
    method: POST
    src: "{{ filename }}.tar"
    remote_src: yes
    return_content: yes
    headers:
      Authorization: "Bearer {{ gcs_token_resp.json.access_token }}"
      Content-Type: application/x-tar
  ignore_errors: yes
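For reference (a hedged usage sketch; the values are placeholders), the gcs_bucket_name and filename variables referenced above would be supplied by the play, for example:

vars:
  gcs_bucket_name: my-backup-bucket  # hypothetical bucket name
  filename: nightly-backup           # hypothetical base name, expands to nightly-backup.tar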


dperezro commented Nov 3, 2020

Hello! When will this be merged?



levonet commented Jan 6, 2021

Is there maybe some alternative to the google.cloud collection?

@dperezro

@levonet there is... but it's a shame that they don't fix this and have abandoned it.


dalbani commented Jan 12, 2021

@dperezro
What is the alternative that you know of?


levonet commented Jan 12, 2021

(image: "I'll fork my own repo, with blackjack and hookers")

@dperezro

@dperezro
What is the alternative that you know of?

Google's Cloud APIs via gcloud commands (although that's interactive, so you need to run gcloud auth login first), or applying this correction (the PR change) locally :-\


ikauzak commented Jan 18, 2021

Still facing the same issue. Hope someone fixes this asap D:

@bucklander

Hail Mary attempt for a merge - @rambleraptor


levonet commented Jun 3, 2021

I finally realized it was a conspiracy between Google and Ansible. The purpose of the conspiracy is to migrate everyone to k8s, Helm, and Terraform. It happened to me :/

@alvaro-gh

@rambleraptor mentioning you since you seem to be the main contributor. Any chance this can be merged? Pretty please?

@nkvojvodic

I find GCP's lack of interest in merging simple, badly needed fixes...disturbing.

@rmoriar1
Any chance of getting this merged?

rmoriar1 merged commit f9c8c18 into ansible-collections:master Sep 13, 2021
@alvaro-gh

Thanks for merging that. I noticed that the last release for this collection was in February 2020. I spent all day doing a manual fix to pull this specific commit in as a requirement in AWX, and it turns out that AWX with Ansible 2.9 can't use a requirements.yml with a git version.
So, here's hoping for a 1.0.3 release.
Thanks again; I first noticed this issue on 09/10/2021.

bucklander deleted the bwallander_issue_266 branch September 22, 2021 19:27
@jessequinn

This should help someone:

requirements.yml

collections:
  - name: git+https://github.com/ansible-collections/google.cloud.git
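For reference, a collection listed this way is installed with ansible-galaxy:

ansible-galaxy collection install -r requirements.yml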
