9 changes: 6 additions & 3 deletions .github/workflows/run_jupyter_notebooks.yml
@@ -90,8 +90,11 @@ jobs:
PYTHONPATH: "${{ github.workspace }}/src"
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
MAXTEXT_REPO_ROOT=$(pwd)
MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/maxtext/examples"
source .venv/bin/activate

export MAXTEXT_REPO_ROOT=$(pwd)
export MAXTEXT_PKG_DIR=$(pwd)/src/maxtext
export MAXTEXT_NOTEBOOKS_ROOT="$MAXTEXT_REPO_ROOT/src/maxtext/examples"

for notebook in "$MAXTEXT_NOTEBOOKS_ROOT"/{sft,rl}*.ipynb; do
filename=$(basename "$notebook")
@@ -101,7 +104,7 @@
echo "Running $filename ..."
echo "------------------------------------------------------"

.venv/bin/papermill "$notebook" "$output_name" -k maxtext_venv
papermill "$notebook" "$output_name" -k maxtext_venv
done
- name: Record Commit IDs
shell: bash
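For reference, the loop above runs each `sft*`/`rl*` example notebook through papermill using the registered `maxtext_venv` kernel. A minimal Python sketch of the equivalent flow (the output-file naming below is an assumption, since that line sits outside the visible hunk):

```python
# Sketch of the notebook-execution loop from the workflow above.
# Assumes papermill is installed in the active environment and that a
# Jupyter kernel named "maxtext_venv" is registered.
import glob
import os

import papermill as pm

repo_root = os.getcwd()
notebooks_root = os.path.join(repo_root, "src", "maxtext", "examples")

# Mirrors the workflow's "{sft,rl}*.ipynb" brace expansion.
notebooks = sorted(
    glob.glob(os.path.join(notebooks_root, "sft*.ipynb"))
    + glob.glob(os.path.join(notebooks_root, "rl*.ipynb"))
)

for notebook in notebooks:
    filename = os.path.basename(notebook)
    # Assumed naming convention; the workflow's actual output_name line is not shown.
    output_name = filename.replace(".ipynb", "_output.ipynb")
    print(f"Running {filename} ...")
    pm.execute_notebook(notebook, output_name, kernel_name="maxtext_venv")
```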
2 changes: 1 addition & 1 deletion .github/workflows/run_pathways_tests.yml
@@ -100,7 +100,7 @@ jobs:
export MAXTEXT_REPO_ROOT=$(pwd)
export MAXTEXT_ASSETS_ROOT=$(pwd)/src/maxtext/assets
export MAXTEXT_TEST_ASSETS_ROOT=$(pwd)/tests/assets
export MAXTEXT_PKG_DIR=$(pwd)/src/MaxText
export MAXTEXT_PKG_DIR=$(pwd)/src/maxtext
# TODO(b/454659463): Enable test_default_hlo_match after volume mount is supported.
.venv/bin/python3 -m pytest ${{ inputs.pytest_addopts }} -v -m "${FINAL_PYTEST_MARKER}" -k "not AotHloIdenticalTest and not CompileThenLoad" --durations=0
env:
2 changes: 1 addition & 1 deletion .github/workflows/run_tests_against_package.yml
@@ -110,7 +110,7 @@ jobs:
export MAXTEXT_REPO_ROOT=$(pwd)
export MAXTEXT_ASSETS_ROOT=$(pwd)/src/maxtext/assets
export MAXTEXT_TEST_ASSETS_ROOT=$(pwd)/tests/assets
export MAXTEXT_PKG_DIR=$(pwd)/src/MaxText
export MAXTEXT_PKG_DIR=$(pwd)/src/maxtext
# omit this libtpu init args for gpu tests
if [ "${{ inputs.device_type }}" != "cuda12" ]; then
export LIBTPU_INIT_ARGS='--xla_tpu_scoped_vmem_limit_kib=65536'
8 changes: 4 additions & 4 deletions .vscode/launch.json
@@ -9,7 +9,7 @@
"justMyCode": false,
"python": "python3",
"module": "maxtext.decode",
"args": ["src/MaxText/configs/base.yml",
"args": ["src/maxtext/configs/base.yml",
"run_name=runner_$(date +%Y-%m-%d-%H-%M)",
"base_output_directory=gs://test-maxtext-output",
"dataset_path=gs://test-maxtext-dataset",
@@ -36,7 +36,7 @@
"justMyCode": false,
"python": "python3",
"module": "maxtext.decode",
"args": ["src/MaxText/configs/base.yml",
"args": ["src/maxtext/configs/base.yml",
"run_name=runner_$(date +%Y-%m-%d-%H-%M)",
"base_output_directory=gs://test-maxtext-output",
"dataset_path=gs://test-maxtext-dataset",
@@ -52,7 +52,7 @@
"justMyCode": false,
"python": "python3",
"module": "MaxText.train",
"args": ["src/MaxText/configs/base.yml",
"args": ["src/maxtext/configs/base.yml",
"run_name=runner_$(date +%Y-%m-%d-%H-%M)",
"base_output_directory=gs://test-maxtext-output",
"dataset_path=gs://test-maxtext-dataset",
@@ -68,7 +68,7 @@
"python": "python3",
"module": "maxtext.inference.inference_microbenchmark",
"args": [
"src/MaxText/configs/base.yml",
"src/maxtext/configs/base.yml",
"model_name=llama2-7b",
"tokenizer_path=src/maxtext/assets/tokenizers/tokenizer.llama2",
"weight_dtype=bfloat16",
8 changes: 4 additions & 4 deletions PREFLIGHT.md
@@ -7,12 +7,12 @@ Before you run ML workload on Multihost with GCE or GKE, simply apply `bash pref

Here is an example for GCE:
```
bash preflight.sh PLATFORM=GCE && python3 -m MaxText.train src/MaxText/configs/base.yml run_name=$YOUR_JOB_NAME
bash preflight.sh PLATFORM=GCE && python3 -m MaxText.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
```

Here is an example for GKE:
```
bash preflight.sh PLATFORM=GKE && python3 -m MaxText.train src/MaxText/configs/base.yml run_name=$YOUR_JOB_NAME
bash preflight.sh PLATFORM=GKE && python3 -m MaxText.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
```

# Optimization 2: Numa binding (You can only apply this to v4 and v5p)
@@ -22,14 +22,14 @@ For GCE,
[preflight.sh](https://github.com/google/maxtext/blob/main/preflight.sh) will help you install `numactl` dependency, so you can use it directly, here is an example:

```
bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m MaxText.train src/MaxText/configs/base.yml run_name=$YOUR_JOB_NAME
bash preflight.sh PLATFORM=GCE && numactl --membind 0 --cpunodebind=0 python3 -m MaxText.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
```

For GKE,
`numactl` should be built into your docker image from [maxtext_tpu_dependencies.Dockerfile](https://github.com/google/maxtext/blob/main/dependencies/dockerfiles/maxtext_tpu_dependencies.Dockerfile), so you can use it directly if you built the maxtext docker image. Here is an example

```
bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m MaxText.train src/MaxText/configs/base.yml run_name=$YOUR_JOB_NAME
bash preflight.sh PLATFORM=GKE && numactl --membind 0 --cpunodebind=0 python3 -m MaxText.train src/maxtext/configs/base.yml run_name=$YOUR_JOB_NAME
```

1. `numactl`: This is the command-line tool used for controlling NUMA policy for processes or shared memory. It's particularly useful on multi-socket systems where memory locality can impact performance.
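A rough Python equivalent of the GCE command above, for anyone wrapping the launch programmatically. This is a sketch only; it assumes `numactl` has already been installed by `preflight.sh`, that NUMA node 0 is the correct node for the host, and that the `run_name` value is a placeholder:

```python
# Sketch: pin CPU and memory to NUMA node 0, then launch training, mirroring
# the numactl invocation shown above.
import subprocess

subprocess.run(
    [
        "numactl", "--membind", "0", "--cpunodebind=0",
        "python3", "-m", "MaxText.train",
        "src/maxtext/configs/base.yml",
        "run_name=my_job",  # placeholder run name
    ],
    check=True,
)
```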
6 changes: 3 additions & 3 deletions benchmarks/api_server/README.md
@@ -33,7 +33,7 @@ export HF_TOKEN=<your_hugging_face_token>

The primary way to launch the API server is by using the `start_server.sh` script. This script ensures that the server is run from the project's root directory, which is necessary for the Python interpreter to find all the required modules.

The script takes the path to a base configuration file (e.g., `MaxText/configs/base.yml`) followed by any number of model-specific configuration overrides.
The script takes the path to a base configuration file (e.g., `maxtext/configs/base.yml`) followed by any number of model-specific configuration overrides.

### Benchmarking Configuration

@@ -56,7 +56,7 @@ Here is an example of how to launch the server with a `qwen3-30b-a3b` model, con
# Make sure you are in the root directory of the maxtext project.

bash benchmarks/api_server/start_server.sh \
MaxText/configs/base.yml \
maxtext/configs/base.yml \
model_name="qwen3-30b-a3b" \
tokenizer_path="Qwen/Qwen3-30B-A3B-Thinking-2507" \
load_parameters_path="<path_to_your_checkpoint>" \
@@ -135,7 +135,7 @@ CMD="export HF_TOKEN=${HF_TOKEN} && \
pip install --upgrade pip && \
pip install -r benchmarks/api_server/requirements.txt && \
bash benchmarks/api_server/start_server.sh \
MaxText/configs/base.yml \
maxtext/configs/base.yml \
model_name="${MODEL_NAME}" \
tokenizer_path="${TOKENIZER_PATH}" \
load_parameters_path="${LOAD_PARAMETERS_PATH}" \
2 changes: 1 addition & 1 deletion benchmarks/api_server/launch_gke_server.sh.template
@@ -53,7 +53,7 @@ CMD="export HF_TOKEN=${HF_TOKEN} && \
pip install --upgrade pip && \
pip install -r benchmarks/api_server/requirements.txt && \
bash benchmarks/api_server/start_server.sh \
MaxText/configs/base.yml \
maxtext/configs/base.yml \
model_name=\"${MODEL_NAME}\" \
tokenizer_path=\"${TOKENIZER_PATH}\" \
load_parameters_path=\"${LOAD_PARAMETERS_PATH}\" \
2 changes: 1 addition & 1 deletion benchmarks/api_server/start_server.sh
@@ -20,7 +20,7 @@
#
# Example:
# bash benchmarks/api_server/start_server.sh \
# MaxText/configs/base.yml \
# maxtext/configs/base.yml \
# model_name="qwen3-30b-a3b" \
# tokenizer_path="Qwen/Qwen3-30B-A3B-Thinking-2507" \
# load_parameters_path="<path_to_your_checkpoint>" \
5 changes: 4 additions & 1 deletion benchmarks/globals.py
@@ -25,7 +25,10 @@
r if os.path.isdir(os.path.join(r := os.path.dirname(os.path.dirname(__file__)), ".git")) else MAXTEXT_PKG_DIR,
)

# This is the configs root: with "base.yml"; "models/"; &etc.
MAXTEXT_CONFIGS_DIR = os.environ.get("MAXTEXT_CONFIGS_DIR", os.path.join(MAXTEXT_REPO_ROOT, "src", "maxtext", "configs"))

# This is the assets root: with "tokenizers/"; &etc.
MAXTEXT_ASSETS_ROOT = os.environ.get("MAXTEXT_ASSETS_ROOT", os.path.join(MAXTEXT_REPO_ROOT, "src", "maxtext", "assets"))

__all__ = ["MAXTEXT_ASSETS_ROOT", "MAXTEXT_PKG_DIR", "MAXTEXT_REPO_ROOT"]
__all__ = ["MAXTEXT_ASSETS_ROOT", "MAXTEXT_CONFIGS_DIR", "MAXTEXT_PKG_DIR", "MAXTEXT_REPO_ROOT"]
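The new `MAXTEXT_CONFIGS_DIR` follows the same pattern as the other roots: an environment-variable override with a repo-relative default. A small usage sketch (the call site is illustrative):

```python
# Sketch of resolving the configs root after this change. Because the value is
# captured at import time, any MAXTEXT_CONFIGS_DIR override must be exported
# before benchmarks.globals is imported.
import os

from benchmarks.globals import MAXTEXT_CONFIGS_DIR

base_config = os.path.join(MAXTEXT_CONFIGS_DIR, "base.yml")
print(base_config)  # <repo_root>/src/maxtext/configs/base.yml by default
```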
8 changes: 4 additions & 4 deletions benchmarks/maxtext_xpk_runner.py
@@ -35,7 +35,7 @@
import omegaconf

import benchmarks.maxtext_trillium_model_configs as model_configs
from benchmarks.globals import MAXTEXT_PKG_DIR
from benchmarks.globals import MAXTEXT_CONFIGS_DIR
from benchmarks.command_utils import run_command_with_updates
import benchmarks.xla_flags_library as xla_flags
from benchmarks.disruption_management.disruption_handler import DisruptionConfig
@@ -107,7 +107,7 @@ class WorkloadConfig:
generate_metrics_and_upload_to_big_query: bool = True
hardware_id: str = "v6e"
metrics_gcs_file: str = ""
base_config: str = os.path.join(MAXTEXT_PKG_DIR, "configs", "base.yml")
base_config: str = os.path.join(MAXTEXT_CONFIGS_DIR, "base.yml")
topology: str = dataclasses.field(init=False)
num_devices_per_slice: int = dataclasses.field(init=False)
db_project: str = ""
@@ -354,7 +354,7 @@ def _build_args_from_config(wl_config: WorkloadConfig) -> dict:
"xla_flags": f"'{xla_flags_str}'",
"dataset": dataset,
"run_type": "maxtext-xpk",
"config_file": os.path.join(MAXTEXT_PKG_DIR, "configs", "base.yml"),
"config_file": os.path.join(MAXTEXT_CONFIGS_DIR, "base.yml"),
"topology": wl_config.topology,
"tuning_params": f"'{tuning_params_str}'",
"db_project": wl_config.db_project,
@@ -440,7 +440,7 @@ def build_user_command(
f"export JAX_PLATFORMS={jax_platforms} &&",
"export ENABLE_PJRT_COMPATIBILITY=true &&",
"export MAXTEXT_ASSETS_ROOT=/deps/src/maxtext/assets MAXTEXT_PKG_DIR=/deps/src/MaxText MAXTEXT_REPO_ROOT=/deps &&"
f'{hlo_dump} python3 -m MaxText.train {os.path.join(MAXTEXT_PKG_DIR, "configs", "base.yml")}',
f'{hlo_dump} python3 -m MaxText.train {os.path.join(MAXTEXT_CONFIGS_DIR, "base.yml")}',
f"{config_tuning_params}",
f"steps={wl_config.num_steps}",
f"model_name={wl_config.model.model_type}",
6 changes: 3 additions & 3 deletions benchmarks/mmlu/mmlu_eval.py
@@ -20,21 +20,21 @@

To run the MMLU benchmark:
# Default is zero-shot prompting
python3 -m benchmarks.mmlu.mmlu_eval src/MaxText/configs/base.yml \
python3 -m benchmarks.mmlu.mmlu_eval src/maxtext/configs/base.yml \
tokenizer_path=src/maxtext/assets/tokenizer_llama3.tiktoken \
load_parameters_path=check_point_path model_name=llama3.1-8b \
max_prefill_predict_length=1024 max_target_length=2048 ici_tensor_parallelism=4 per_device_batch_size=1

# Example of using the prompt_template flag for Chain-of-Thought (CoT) prompting:
python3 -m benchmarks.mmlu.mmlu_eval src/MaxText/configs/base.yml \
python3 -m benchmarks.mmlu.mmlu_eval src/maxtext/configs/base.yml \
tokenizer_path=src/maxtext/assets/tokenizer_llama3.tiktoken \
load_parameters_path=check_point_path model_name=llama3.1-8b \
max_prefill_predict_length=1024 max_target_length=2048 ici_tensor_parallelism=4 per_device_batch_size=1 \
prompt_template="The following are multiple choice questions (with answers) about {subject}.\n\n{question}\n
{choices}\nAnswer: Let's think step by step."

# Example of using the prompt_template flag for 5-shot prompting (replace with actual examples):
python3 -m benchmarks.mmlu.mmlu_eval src/MaxText/configs/base.yml \
python3 -m benchmarks.mmlu.mmlu_eval src/maxtext/configs/base.yml \
tokenizer_path=src/maxtext/assets/tokenizer_llama3.tiktoken \
load_parameters_path=check_point_path model_name=llama3.1-8b \
max_prefill_predict_length=1024 max_target_length=2048 ici_tensor_parallelism=4 per_device_batch_size=1 \
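The `prompt_template` flag shown in the docstring above is an ordinary Python format string with `{subject}`, `{question}`, and `{choices}` placeholders. A short sketch of how the chain-of-thought template expands (the question and choices are invented for illustration):

```python
# Illustrative expansion of the CoT prompt_template from the docstring above.
template = (
    "The following are multiple choice questions (with answers) about {subject}.\n\n"
    "{question}\n{choices}\nAnswer: Let's think step by step."
)

prompt = template.format(
    subject="college physics",
    question="What is the SI unit of force?",
    choices="A. joule\nB. newton\nC. watt\nD. pascal",
)
print(prompt)
```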
10 changes: 5 additions & 5 deletions docs/guides/checkpointing_solutions/convert_checkpoint.md
@@ -66,7 +66,7 @@ export LAZY_LOAD_TENSORS=<Flag to lazy load> # True to use lazy load, False to u
Finally, run below command to complete the conversion

```bash
python3 -m MaxText.utils.ckpt_conversion.to_maxtext MaxText/configs/base.yml \
python3 -m MaxText.utils.ckpt_conversion.to_maxtext maxtext/configs/base.yml \
model_name=${HF_MODEL} \
hf_access_token=${HF_TOKEN} \
base_output_directory=${MODEL_CHECKPOINT_DIRECTORY} \
@@ -104,7 +104,7 @@ Use the `to_huggingface.py` script to convert a MaxText checkpoint into the Hugg
The following command converts a MaxText checkpoint and saves it locally, to GCS, or uploads it directly to the Hugging Face Hub.

```bash
python3 -m MaxText.utils.ckpt_conversion.to_huggingface src/MaxText/configs/base.yml \
python3 -m MaxText.utils.ckpt_conversion.to_huggingface src/maxtext/configs/base.yml \
model_name=<MODEL_NAME> \
load_parameters_path=<path-to-maxtext-checkpoint> \
base_output_directory=<path-to-save-converted-checkpoint> \
@@ -131,7 +131,7 @@ To ensure the conversion was successful, you can use the `tests/utils/forward_pa
### Usage

```bash
python3 -m tests.utils.forward_pass_logit_checker src/MaxText/configs/base.yml \
python3 -m tests.utils.forward_pass_logit_checker src/maxtext/configs/base.yml \
tokenizer_path=assets/<tokenizer> \
load_parameters_path=<path-to-maxtext-checkpoint> \
model_name=<MODEL_NAME> \
@@ -216,8 +216,8 @@ To extend conversion support to a new model architecture, you must define its sp
- In [`utils/param_mapping.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/param_mapping.py), add the `hook_fn` logic (`def {MODEL}_MAXTEXT_TO_HF_PARAM_HOOK_FN`). This is the transformation needed per layer.

2. **Add Hugging Face weights Shape**: In [`utils/hf_shape.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/hf_shape.py), define the tensor shape of Hugging Face format (`def {MODEL}_HF_WEIGHTS_TO_SHAPE`). This is used to ensure the tensor shape is matched after to_huggingface conversion.
1. **Register model key**: In [`utils/utils.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/utils.py), add the new model key in `HF_IDS`.
1. **Add transformer config**: In [`utils/hf_model_configs.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/hf_model_configs.py), add the `transformers.Config` object, describing the Hugging Face model configuration (defined in ['src/MaxText/configs/models'](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/MaxText/configs/models)). **Note**: This configuration must precisely match the MaxText model's architecture.
3. **Register model key**: In [`utils/utils.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/utils.py), add the new model key in `HF_IDS`.
4. **Add transformer config**: In [`utils/hf_model_configs.py`](https://github.com/AI-Hypercomputer/maxtext/blob/main/src/MaxText/utils/ckpt_conversion/utils/hf_model_configs.py), add the `transformers.Config` object, describing the Hugging Face model configuration (defined in ['src/maxtext/configs/models'](https://github.com/AI-Hypercomputer/maxtext/tree/main/src/maxtext/configs/models)). **Note**: This configuration must precisely match the MaxText model's architecture.

Here is an example [PR to add support for gemma3 multi-modal model](https://github.com/AI-Hypercomputer/maxtext/pull/1983)
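As a rough illustration of the registration steps above (not code copied from the MaxText tree), the new entries might look like the following, assuming `HF_IDS` maps MaxText model names to Hugging Face repo ids:

```python
# Hypothetical sketch only: the keys, names, and config values below are
# illustrative and are not copied from the MaxText codebase.
import transformers

# Step 3 (utils/utils.py): assume HF_IDS maps a MaxText model name to the
# Hugging Face repo id used for the converted checkpoint.
HF_IDS = {
    "llama3.1-8b": "meta-llama/Llama-3.1-8B",
    # "my-new-model": "my-org/my-new-model",  # new entry for the added model
}

# Step 4 (utils/hf_model_configs.py): a transformers config object whose values
# must match the MaxText model config under src/maxtext/configs/models.
my_new_model_config = transformers.LlamaConfig(
    vocab_size=128256,
    hidden_size=4096,
    intermediate_size=14336,
    num_hidden_layers=32,
    num_attention_heads=32,
    num_key_value_heads=8,
)
```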
