This is an introduction to the Google Cloud Storage service. Google Cloud Storage lets users store large numbers of files on Google's network. Google operates a worldwide network of data centers that can store your files on a distributed platform.
In Google Cloud Storage, the top-level container for files is called a bucket. A bucket name must be globally unique across all of Google Cloud Storage, across all users. The reason is that the bucket name appears at the very beginning of every URL that refers to it, such as gs://bucketname. There is no limit on the number of buckets, but each bucket must be associated with a Google project. There is also no limit on the number of items in a bucket. Buckets cannot be nested; I believe this limitation is for performance reasons. Versioning can be enabled at the bucket level.
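To make the naming rules concrete, here is a small sketch that checks a bucket name against a simplified subset of the published naming rules and builds the gs:// URI mentioned above. The bucket and object names are hypothetical, and the regular expression covers only the basic character and length rules, not every edge case (such as IP-address-like names).

```python
import re

# Simplified check of the bucket-naming rules: 3-63 characters,
# lowercase letters, digits, dashes, underscores and dots, starting
# and ending with a letter or digit. This is a subset of the full
# published rules, for illustration only.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name):
    return bool(BUCKET_NAME_RE.match(name))

def gs_uri(bucket, obj=None):
    """Build the gs:// URI for a bucket, or for an object inside it."""
    return f"gs://{bucket}" + (f"/{obj}" if obj else "")

print(is_valid_bucket_name("my-unique-bucket"))   # True
print(is_valid_bucket_name("Bad_Bucket"))         # False (uppercase)
print(gs_uri("my-unique-bucket", "reports/2015.csv"))
```

Because bucket names are global, a name check like this is worth running before attempting to create a bucket, since a create call can fail simply because someone else already owns that name.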
There are three types of buckets in Google Cloud Storage.
- Standard Storage Buckets – Files that are frequently accessed or currently needed can be stored in Standard Storage. This is low-latency storage.
- Durable Reduced Availability (DRA) Buckets – This is the best option when you would like to store archived files in Google Cloud Storage. It is a good practice to have a job that moves files that are 30 days or older from Standard Storage to a DRA bucket.
- Nearline Storage Buckets – This has slightly higher latency than Standard Storage but is faster than a DRA bucket. It is a good practice to have a job that moves files from Standard Storage to Nearline storage every week, and then from Nearline to DRA buckets every month.
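The tiering schedule above can be sketched as a simple age-based rule. This is a hypothetical policy following the thresholds mentioned in the list (roughly a week for Nearline, 30 days for DRA), not an official Google recommendation; the actual move would be done by a scheduled job using gsutil or the API.

```python
from datetime import datetime, timedelta

def target_bucket_type(last_accessed, now):
    """Decide which bucket type a file belongs in, by age.

    Illustrative thresholds: under 7 days old stays in Standard,
    7-29 days goes to Nearline, 30 days or older goes to DRA.
    """
    age = now - last_accessed
    if age >= timedelta(days=30):
        return "dra"
    if age >= timedelta(days=7):
        return "nearline"
    return "standard"

now = datetime(2015, 6, 1)
print(target_bucket_type(datetime(2015, 5, 31), now))  # standard
print(target_bucket_type(datetime(2015, 5, 20), now))  # nearline
print(target_bucket_type(datetime(2015, 4, 1), now))   # dra
```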
Pricing is highest for Standard Storage, followed by DRA buckets and then Nearline buckets.
Access control can be set at the project level, the bucket level, and even the individual-file level. There are no limits on file size. Google captures and provides rich metadata about the files stored, as well as their state.
To automate uploading, downloading, and deleting files in buckets, Google provides a Python-based command-line tool called gsutil. It offers command-line options for the various operations. Authentication can be done one time by generating a .boto configuration file and storing it on the client system that runs gsutil. gsutil also provides resumable uploads and downloads.
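Since gsutil is a command-line tool, one simple way to automate it is to drive it from a Python script. The sketch below only builds and prints the command lines; actually executing them (the commented-out subprocess call) requires gsutil on the PATH and the one-time .boto configuration described above. The bucket and file names are placeholders.

```python
import subprocess

def gsutil_cmd(action, *args):
    """Build an argument list for a gsutil invocation, e.g. cp, rm, ls."""
    return ["gsutil", action] + list(args)

def run(cmd, dry_run=True):
    """Show (or, when dry_run=False, execute) a gsutil command."""
    if dry_run:
        return " ".join(cmd)
    subprocess.check_call(cmd)  # real execution needs gsutil + .boto

print(run(gsutil_cmd("cp", "report.csv", "gs://my-unique-bucket/")))
print(run(gsutil_cmd("rm", "gs://my-unique-bucket/old.csv")))
```

A nightly cron job wrapping calls like these is enough to implement the Standard-to-Nearline-to-DRA migration schedule discussed earlier.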
To share files with others, you can generate a signed URL for the file with an expiration time encoded in the URL. The link stops working once the specified time passes.
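A signed URL of this kind carries its expiry as a Unix timestamp in the Expires query parameter, alongside the credential and signature parameters. The helper below checks whether such a link is still valid; generating the signature itself requires a service-account key and is omitted. The URL in the example is made up for illustration.

```python
import time
from urllib.parse import urlparse, parse_qs

def is_signed_url_expired(url, now=None):
    """Check a signed URL's Expires query parameter against the clock."""
    now = now if now is not None else time.time()
    qs = parse_qs(urlparse(url).query)
    expires = int(qs["Expires"][0])
    return now >= expires

url = ("https://storage.googleapis.com/my-unique-bucket/report.csv"
       "?GoogleAccessId=svc@example.iam.gserviceaccount.com"
       "&Expires=1435708800&Signature=abc123")
print(is_signed_url_expired(url, now=1435708799))  # False
print(is_signed_url_expired(url, now=1435708801))  # True
```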
One of the most frequently asked questions is whether Google Cloud Storage can be mapped like a local network drive. As of now, it is not possible, though I expect this to happen soon. Google does provide one related option: it can pull files from your externally available storage via HTTP links.
You can automate the upload and download of files to Google Cloud Storage either with the gsutil tool or with the Cloud Storage API. Google provides a very good API Explorer for all of its APIs.
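For the API route, requests are plain HTTPS calls against the JSON API endpoint (see the API reference in the links below). A small sketch of building those request URLs: the endpoint shape is taken from the JSON API, the bucket and object names are placeholders, and note that object names must be URL-encoded, slashes included.

```python
from urllib.parse import quote

BASE = "https://www.googleapis.com/storage/v1"

def object_url(bucket, obj):
    """URL for a single object's metadata; the object name is escaped."""
    return f"{BASE}/b/{bucket}/o/{quote(obj, safe='')}"

def list_objects_url(bucket):
    """URL that lists the objects in a bucket."""
    return f"{BASE}/b/{bucket}/o"

print(object_url("my-unique-bucket", "reports/2015.csv"))
# https://www.googleapis.com/storage/v1/b/my-unique-bucket/o/reports%2F2015.csv
print(list_objects_url("my-unique-bucket"))
```

Issuing the actual GET against these URLs additionally requires an OAuth 2.0 access token in the Authorization header, which the API Explorer handles for you while experimenting.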
Credits and References
- Cost based on bucket types – Pricing details
- Geographic Location of Buckets – Geo locations
- Object Versioning – Versioning and its details
- API Reference – https://cloud.google.com/storage/docs/json_api/
- GSUTIL reference – https://cloud.google.com/storage/docs/getting-started-gsutil