--accelerator <ACCELERATOR> | Manage the accelerator config for GPU serving. When deploying a model with
Compute Engine machine types, a GPU accelerator may also
be selected.
+
*type*::: The type of the accelerator. Choices are 'nvidia-tesla-a100', 'nvidia-tesla-k80', 'nvidia-tesla-p100', 'nvidia-tesla-p4', 'nvidia-tesla-t4', 'nvidia-tesla-v100'.
+
*count*::: The number of accelerators to attach to each machine running the job.
If not specified, the default value is 1. Your model must be specially designed
to accommodate more than 1 accelerator per machine. To configure how many
replicas your model has, set the `manualScaling` or `autoScaling`
parameters |
--account <ACCOUNT> | Google Cloud Platform user account to use for invocation. Overrides the default *core/account* property value for this command invocation |
--async | Return immediately, without waiting for the operation in progress to
complete |
--billing-project <BILLING_PROJECT> | The Google Cloud Platform project that will be charged quota for operations performed in gcloud. If you need to operate on one project, but need quota against a different project, you can use this flag to specify the billing project. If both `billing/quota_project` and `--billing-project` are specified, `--billing-project` takes precedence. Run `$ gcloud config set --help` to see more information about `billing/quota_project` |
--config <CONFIG> | Path to a YAML configuration file containing configuration parameters
for the
[Version](https://cloud.google.com/ai-platform/prediction/docs/reference/rest/v1/projects.models.versions)
to create.
+
Note that not all attributes of a version are configurable;
the available attributes (with example values) are:
+
description: A free-form description of the version.
deploymentUri: gs://path/to/source
runtimeVersion: '2.1'
# Set only one of manualScaling or autoScaling.
manualScaling:
  nodes: 10  # The number of nodes to allocate for this model.
autoScaling:
  minNodes: 0  # The minimum number of nodes to allocate for this model.
labels:
  user-defined-key: user-defined-value
+
The name of the version must always be specified via the required
VERSION argument.
+
Only one of manualScaling or autoScaling may be specified. If both
are specified in the same YAML file, an error is returned.
+
If an option is specified both in the configuration file and via
command-line arguments, the command-line arguments override the
configuration file |
--configuration <CONFIGURATION> | The configuration to use for this command invocation. For more
information on how to use configurations, run:
`gcloud topic configurations`. You can also use the `CLOUDSDK_ACTIVE_CONFIG_NAME` environment
variable to set the equivalent of this flag for a terminal
session |
--description <DESCRIPTION> | Description of the version |
--flags-file <YAML_FILE> | A YAML or JSON file that specifies a *--flag*:*value* dictionary.
Useful for specifying complex flag values with special characters
that work with any command interpreter. Additionally, each
*--flags-file* arg is replaced by its constituent flags. See
`$ gcloud topic flags-file` for more information |
--flatten <KEY> | Flatten _name_[] output resource slices in _KEY_ into separate records
for each item in each slice. Multiple keys and slices may be specified.
This also flattens keys for *--format* and *--filter*. For example,
*--flatten=abc.def* flattens *abc.def[].ghi* references to
*abc.def.ghi*. A resource record containing *abc.def[]* with N elements
will expand to N records in the flattened output. This flag interacts
with other flags that are applied in this order: *--flatten*,
*--sort-by*, *--filter*, *--limit* |
--format <FORMAT> | Set the format for printing command output resources. The default is a
command-specific human-friendly output format. The supported formats
are: `config`, `csv`, `default`, `diff`, `disable`, `flattened`, `get`, `json`, `list`, `multi`, `none`, `object`, `table`, `text`, `value`, `yaml`. For more details run `$ gcloud topic formats` |
--framework <FRAMEWORK> | ML framework used to train this version of the model. If not specified, defaults to 'tensorflow'. _FRAMEWORK_ must be one of: *scikit-learn*, *tensorflow*, *xgboost* |
--help | Display detailed help |
--impersonate-service-account <SERVICE_ACCOUNT_EMAIL> | For this gcloud invocation, all API requests will be made as the given service account instead of the currently selected account. This is done without needing to create, download, and activate a key for the account. In order to perform operations as the service account, your currently selected account must have an IAM role that includes the iam.serviceAccounts.getAccessToken permission for the service account. The roles/iam.serviceAccountTokenCreator role has this permission or you may create a custom role. Overrides the default *auth/impersonate_service_account* property value for this command invocation |
--labels <KEY=VALUE> | List of label KEY=VALUE pairs to add.
+
Keys must start with a lowercase character and contain only hyphens (`-`), underscores (`_`), lowercase characters, and numbers. Values must contain only hyphens (`-`), underscores (`_`), lowercase characters, and numbers |
--log-http | Log all HTTP server requests and responses to stderr. Overrides the default *core/log_http* property value for this command invocation |
--machine-type <MACHINE_TYPE> | Type of machine on which to serve the model. Currently only applies to online prediction. For available machine types,
see https://cloud.google.com/ai-platform/prediction/docs/machine-types-online-prediction#available_machine_types |
--model <MODEL> | Name of the model |
--origin <ORIGIN> | Location of ```model/``` "directory" (see
https://cloud.google.com/ai-platform/prediction/docs/deploying-models#upload-model).
+
This overrides `deploymentUri` in the `--config` file. If this flag is
not passed, `deploymentUri` *must* be specified in the file from
`--config`.
+
Can be a Cloud Storage (`gs://`) path or local file path (no
prefix). In the latter case the files will be uploaded to Cloud
Storage and a `--staging-bucket` argument is required |
--project <PROJECT_ID> | The Google Cloud Platform project ID to use for this invocation. If
omitted, then the current project is assumed; the current project can
be listed using `gcloud config list --format='text(core.project)'`
and can be set using `gcloud config set project PROJECTID`.
+
`--project` and its fallback `core/project` property play two roles
in the invocation. It specifies the project of the resource to
operate on. It also specifies the project for API enablement check,
quota, and billing. To specify a different project for quota and
billing, use `--billing-project` or `billing/quota_project` property |
--python-version <PYTHON_VERSION> | Version of Python used when creating the version. Choices are 3.7, 3.5, and 2.7.
+
Must be used with a compatible runtime version:
+
* 3.7 is compatible with runtime versions 1.15 and later.
* 3.5 is compatible with runtime versions 1.4 through 1.14.
* 2.7 is compatible with runtime versions 1.15 and earlier |
--quiet | Disable all interactive prompts when running gcloud commands. If input
is required, defaults will be used, or an error will be raised.
Overrides the default *core/disable_prompts* property value for this
command invocation. This is equivalent to setting the environment
variable `CLOUDSDK_CORE_DISABLE_PROMPTS` to 1 |
--region <REGION> | Google Cloud region of the regional endpoint to use for this command.
If unspecified, the command uses the global endpoint of the AI Platform Training
and Prediction API.
+
Learn more about regional endpoints and see a list of available regions:
https://cloud.google.com/ai-platform/prediction/docs/regional-endpoints
+
_REGION_ must be one of: *asia-east1*, *asia-northeast1*, *asia-southeast1*, *australia-southeast1*, *europe-west1*, *europe-west2*, *europe-west3*, *europe-west4*, *northamerica-northeast1*, *us-central1*, *us-east1*, *us-east4*, *us-west1* |
--runtime-version <RUNTIME_VERSION> | AI Platform runtime version for this job. Must be specified unless --master-image-uri is specified instead. It is defined in documentation along with the list of supported versions: https://cloud.google.com/ai-platform/prediction/docs/runtime-version-list |
--staging-bucket <STAGING_BUCKET> | Bucket in which to stage training archives.
+
Required only if a file upload is necessary (that is, other flags
include local paths) and no other flags implicitly specify an upload
path |
--trace-token <TRACE_TOKEN> | Token used to route traces of service requests for investigation of issues. Overrides the default *core/trace_token* property value for this command invocation |
--user-output-enabled | Print user intended output to the console. Overrides the default *core/user_output_enabled* property value for this command invocation. Use *--no-user-output-enabled* to disable |
--verbosity <VERBOSITY> | Override the default verbosity for this command. Overrides the default *core/verbosity* property value for this command invocation. _VERBOSITY_ must be one of: *debug*, *info*, *warning*, *error*, *critical*, *none* |
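As an illustration of the `--config` workflow described above, here is a minimal sketch. The bucket path, model name, version name, and file name are all hypothetical placeholders; the final `gcloud` call is shown as a comment rather than executed:

```shell
# Write a version configuration file using the attributes documented
# for --config (values here are placeholders).
cat > version-config.yaml <<'EOF'
description: A free-form description of the version.
deploymentUri: gs://my-bucket/path/to/model
runtimeVersion: '2.1'
autoScaling:
  minNodes: 0
labels:
  env: test
EOF

echo "wrote version-config.yaml"

# Create the version from the file. Flags passed on the command line,
# such as --origin, override the corresponding file attributes:
#   gcloud ai-platform versions create v1 \
#     --model my_model \
#     --config version-config.yaml
```

Remember that the version name itself is never set in the file; it always comes from the required VERSION argument.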
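Similarly, a sketch of the `--flags-file` mechanism: a YAML dictionary mapping flag names to values. The file name and values are hypothetical; see `gcloud topic flags-file` for the authoritative format. The `gcloud` call is shown as a comment rather than executed:

```shell
# A flags file is a YAML dictionary of *--flag*:*value* pairs.
cat > create-flags.yaml <<'EOF'
--model: my_model
--runtime-version: '2.1'
--framework: tensorflow
EOF

echo "wrote create-flags.yaml"

# Each --flags-file argument is replaced by its constituent flags:
#   gcloud ai-platform versions create v1 --flags-file create-flags.yaml
```

This is mainly useful for complex flag values with special characters that would otherwise need shell-specific quoting.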