Getting Set Up for Spinnaker Development on AWS
This guide describes how a developer can set up a Local Git deployment of Spinnaker on Amazon EC2.
In this setup, Spinnaker runs on an EC2 instance. Code is edited on the machine and then synced to the EC2 instance for testing.
Configure local machine
- Create forks of all the Spinnaker microservices on GitHub.
- Clone them onto your local machine in a dedicated
/spinnaker
directory. This will simplify syncing them in a batch during development. - Set up Intellij for Spinnaker as specified.
Configure development region
Provision some roles for Spinnaker. The specific permissions necessary for each role may vary depending on what AWS providers you wish to use, but in general you will need:
SpinnakerDevInstanceRole
: The instance role to be used by your Spinnaker developer instance.- Trust policy should include
ec2.amazonaws.com
- Attached policy permissions should include:
- Amazon S3 full access (for persistent storage)
ec2:DescribeRegions
+ec2:DescribeAvailabilityZones
- The ability to assume the
SpinnakerManagedRole
(described below) - Read-only access to Amazon Elastic Container Registry, if using private docker images.
- Trust policy should include
SpinnakerManagedRole
: The role Spinnaker will assume to interact with your AWS resources. See the AWS Setup Guide for more details.
In the AWS console, in the region where you want your development instance to reside, create a key pair .
In the default VPC of the same region, create a security group named “SpinnakerDev” and add a rule allowing port 22 inbound for a restricted set of IPs (for example, corporate firewall ranges).
Provision a development instance that uses these resources with the following steps:
From the AWS Console, navigate to the EC2 launch instance wizard
Select an Ubuntu Server 22.04 Amazon machine image
Select an instance type with sufficient resources to run all the Spinnaker microservices, such as m5.4xlarge
On the “instance details” step:
- select the default VPC for “Network”
- select the previously created
SpinnakerDevInstanceRole
for “IAM role” - under “Advanced Details”, add the following user data which will install some dependencies for your Local Git deployment :
#!/bin/bash set -ex # Install dependencies for localgit installation add-apt-repository ppa:openjdk-r/ppa apt-get update apt-get -y install git curl netcat redis-server openjdk-8-jdk emacs awscli python2.7 python-pip
Increase the storage snapshot size to ~100GB
For tags, add key: “Name” with value: “SpinnakerDev”, to help identify it after creation
Select the previously created
SpinnakerDev
security group you just createdClick “Launch”, select the key pair you just created (or another you can use for SSH), and hit “Launch Instances”
Configure Development EC2 Instance
Get your instance’s DNS name and login via SSH .
ssh -A -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 ubuntu@$SPINNAKER_INSTANCE_DNS -i /path/to/my-key-pair.pem
Follow the GitHub instructions to generate a new SSH key on the Spinnaker instance and add it to your GitHub account .
Install dependencies:
curl -o- https://raw.githubusercontent.com/creationix/nvm/v0.33.11/install.sh | bash
source ~/.bashrc
nvm install stable
npm install -g yarn
curl -O https://raw.githubusercontent.com/spinnaker/halyard/master/install/debian/InstallHalyard.sh
sudo bash InstallHalyard.sh
source ~/.bashrc
hal -v
Configure and deploy the Spinnaker installation:
# Store state in S3 and deploy a recent stable version
hal config storage s3 edit --region $AWS_REGION # use region where your dev instance resides
hal config storage edit --type s3
hal config version edit --version 1.21.0 # or whichever is latest
hal deploy apply
sudo service apache2 stop
sudo systemctl disable apache2
# Workaround: https://github.com/spinnaker/spinnaker/issues/4041
echo > ~/.hal/default/profiles/settings-local.js
# Clone repos from your GitHub account
hal config deploy edit --type localgit --git-origin-user={your github username}
hal config version edit --version branch:master
# Connect your AWS account
## Replace the account id specified below (123456789012) with your own account id.
hal config provider aws account add my-aws-devel-acct \
--account-id 123456789012 \
--assume-role role/SpinnakerManaged
hal config provider aws account edit my-aws-devel-acct --regions $AWS_REGIONS # regions you want Spinnaker to deploy to
hal config provider aws enable
# Connect your Docker registries
hal config provider docker-registry enable
hal config provider docker-registry account add my-dockerhub-devel-acct \
--address index.docker.io \
--repositories {your dockerhub username}/{your dockerhub repository} \
--username {your dockerhub username} \
--password \
--track-digests true
# Replace the address specified below (123456789012.dkr.ecr.us-west-2.amazonaws.com) with the address to your own ecr registry.
hal config provider docker-registry account add my-us-west-2-devel-registry \
--address 123456789012.dkr.ecr.us-west-2.amazonaws.com \
--username AWS \
--password-command "aws --region us-west-2 ecr get-authorization-token --output text --query 'authorizationData[].authorizationToken' | base64 -d | sed 's/^AWS://'" \
--track-digests true
Optionally connect Amazon ECS to your AWS account:
hal config provider ecs account add ecs-my-aws-devel-acct --aws-account my-aws-devel-acct
hal config provider ecs enable
Deploy everything:
hal deploy apply
Wait for Clouddriver to start up by checking the logs in ~/dev/spinnaker/logs/clouddriver.log.
# Clouddriver start up msg displays after all gradle tasks are run
...
> Task :clouddriver-web:run
___ __
/\_ \ /\ \
___\//\ \ ___ __ __ \_\ \
/'___\\ \ \ / __`\/\ \/\ \ /'_` \
/\ \__/ \_\ \_/\ \L\ \ \ \_\ \/\ \L\ \
\ \____\/\____\ \____/\ \____/\ \___,_\
\/____/\/____/\/___/ \/___/ \/__,_ /
__
/\ \ __
\_\ \ _ __ /\_\ __ __ __ _ __
/'_` \/\`'__\/\ \/\ \/\ \ /'__`\/\`'__\
/\ \L\ \ \ \/ \ \ \ \ \_/ |/\ __/\ \ \/
\ \___,_\ \_\ \ \_\ \___/ \ \____\\ \_\
\/__,_ /\/_/ \/_/\/__/ \/____/ \/_/
2020-06-04 21:09:28.940 INFO 29524 --- [ main] b.c.PropertySourceBootstrapConfiguration : Located property source: CompositePropertySo
urce {name='configService', propertySources=[]}
2020-06-04 21:09:28.946 INFO 29524 --- [ main] com.netflix.spinnaker.clouddriver.Main : The following profiles are active: composite
,test,local
...
Daily development
Test Spinnaker Code Changes
Sync your changes to the development instance with rsync
:
rsync --progress -a -e "ssh -i /path/to/my-key-pair.pem" --exclude='*/build/' --exclude='*/.idea/' --exclude='*/out/' --exclude='*/.gradle/' ~/code/spinnaker/ ubuntu@$SPINNAKER_INSTANCE_DNS:/home/ubuntu/dev/spinnaker
# Optional:
ssh ubuntu@$SPINNAKER_INSTANCE_DNS 'for i in ~/dev/spinnaker/*; do (cd $i && echo $i && git checkout master && git clean -fdx); done'
- If
rsync
isn’t an option, you can also push your changes to a remote branch of your fork(s) and manually pull them down to your dev instance before deploying.
Login to the instance , deploy the changes, and check for build or service failures:
ssh -A -L 9000:localhost:9000 -L 8084:localhost:8084 -L 8087:localhost:8087 ubuntu@$SPINNAKER_INSTANCE_DNS -i /path/to/my-key-pair.pem
hal deploy apply
# or for individual service changes: hal deploy apply --service-names=clouddriver,deck,{other_service_with_changes}
cd ~/dev/spinnaker/logs
Test your changes manually at http://localhost:9000.
Regularly sync from upstream
Add the following to your local machine’s .bashrc file
sync-from-upstream() {
for i in ./*; do
(cd $i && echo $i && git checkout master && git pull --rebase upstream master && git push origin upstream/master:master)
done
}
Regularly run sync-from-upstream
in your /spinnaker
directory to keep your local repos and GitHub forks in sync with upstream Spinnaker. Syncing daily or every couple of days will ensure you’re working off the latest code for each service.
Local development with spinnaker/deck
To expedite development of
Deck
or to add ad-hoc console.log
statements for debugging, you can run the web app on your local machine and connect to the services on your development instance over SSH.
After forking and pulling down Deck locally, install dependencies with
yarn
(see README )Run Deck with
yarn run start
- Windows users can circumvent the bash start up script by running it directly with npm:
npm run start-dev-server
- Windows users can circumvent the bash start up script by running it directly with npm:
Open a separate terminal and SSH into your development instance :
ssh -A -L 8084:localhost:8084 -L 8087:localhost:8087 ubuntu@$SPINNAKER_INSTANCE_DNS -i /path/to/my-key-pair.pem
- Access local Deck on
localhost:9000
. Changes made & saved to your local app will prompt the app to refresh.
NOTE: feature flags that would be set as environment variables on your development instance can be manually turned on/off in local deck by setting them in <code>settings.js</code> .
Troubleshooting
Below are some issues you may encounter when running or setting up your dev instance, and how you might remedy them.
“Pool Not Open”
Example exception found in Clouddriver log:
Example Exception (Found in the Clouddriver log file)
2019-08-14 19:30:34.381 ERROR 24799 --- [gentScheduler-1] c.n.s.c.r.c.ClusteredAgentScheduler : Unable to run agents
redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
at redis.clients.util.Pool.getResource(Pool.java:53) ~[jedis-2.9.3.jar:na]
at redis.clients.jedis.JedisPool.getResource(JedisPool.java:226) ~[jedis-2.9.3.jar:na]
at com.netflix.spinnaker.kork.jedis.telemetry.InstrumentedJedisPool.getResource(InstrumentedJedisPool.java:60) ~[kork-jedis-5.11.1.jar:5.11.1]
at com.netflix.spinnaker.kork.jedis.telemetry.InstrumentedJedisPool.getResource(InstrumentedJedisPool.java:26) ~[kork-jedis-5.11.1.jar:5.11.1]
at com.netflix.spinnaker.kork.jedis.JedisClientDelegate.withCommandsClient(JedisClientDelegate.java:45) ~[kork-jedis-5.11.1.jar:5.11.1]
at com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.acquireRunKey(ClusteredAgentScheduler.java:178) ~[cats-redis.jar:na]
at com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.acquire(ClusteredAgentScheduler.java:131) ~[cats-redis.jar:na]
at com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.runAgents(ClusteredAgentScheduler.java:158) ~[cats-redis.jar:na]
at com.netflix.spinnaker.cats.redis.cluster.ClusteredAgentScheduler.run(ClusteredAgentScheduler.java:151) ~[cats-redis.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_222]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_222]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_222]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_222]
Caused by: java.lang.IllegalStateException: Pool not open
...
It’s likely that there is another exception that has occurred when Clouddriver started up (that can be found near the top of the log file). This means that Clouddriver failed to start up successfully, and you will see this stacktrace in the logs for every second (or the defined the polling frequency) that Clouddriver is running with the initial exception.
- Check the top of the log file (if the file is too large, you can use the
head
command to check the top of the file):
head -{number of lines} {log file}
# For instance, if you're looking up the first 1000 lines of the Clouddriver log file.
head -1000 dev/spinnaker/logs/clouddriver.log
- The first exception that you see (e.g., failed API call or authentication issue) is likely the root cause.
“Address already in use”
Example exception found in Clouddriver log:
org.apache.catalina.LifecycleException: Protocol handler start failed
at org.apache.catalina.connector.Connector.startInternal(Connector.java:1008) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.catalina.core.StandardService.addConnector(StandardService.java:227) [tomcat-embed-core-9.0.21.jar:9.0.21]
at org.springframework.boot.web.embedded.tomcat.TomcatWebServer.addPreviouslyRemovedConnectors(TomcatWebServer.java:263) [spring-boot-2.1.6.RELEASE.jar:2.1.6.RELEASE]
at org.springframework.boot.web.embedded.tomcat.TomcatWebServer.start(TomcatWebServer.java:195) [spring-boot-2.1.6.RELEASE.jar:2.1.6.RELEASE]
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.startWebServer(ServletWebServerApplicationContext.java:296) [spring-boot-2.1.6.RELEASE.jar:2.1.6.RELEASE]
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.finishRefresh(ServletWebServerApplicationContext.java:162) [spring-boot-2.1.6.RELEASE.jar:2.1.6.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:552) [spring-context-5.1.8.RELEASE.jar:5.1.8.RELEASE]
...
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_222]
at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_222]
at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_222]
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) ~[na:1.8.0_222]
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) ~[na:1.8.0_222]
at org.apache.tomcat.util.net.NioEndpoint.initServerSocket(NioEndpoint.java:230) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.tomcat.util.net.NioEndpoint.bind(NioEndpoint.java:213) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.tomcat.util.net.AbstractEndpoint.bindWithCleanup(AbstractEndpoint.java:1124) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.tomcat.util.net.AbstractEndpoint.start(AbstractEndpoint.java:1210) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.coyote.AbstractProtocol.start(AbstractProtocol.java:585) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
at org.apache.catalina.connector.Connector.startInternal(Connector.java:1005) ~[tomcat-embed-core-9.0.21.jar:9.0.21]
... 17 common frames omitted
If you get this exception when running hal deploy apply
, check that you don’t already have an instance of that microservice currently running (perhaps as root or another user). Check that the
spinnaker microservice ports
are not already in use.
The following would be an example of how to debug this exception for Clouddriver (port 7002):
- Check if the port for the given microservice is in use, and make a note of the process id (pid) that is using it. For instance for Clouddriver, you would do the following:
sudo netstat -plnt | grep {port}
# For CloudDriver
sudo netstat -plnt | grep 7002
- Using the process id (pid) determined above, check what process is using the open port:
ps -ef | grep {pid}
# For instance, if the process id was 12345
ps -ef | grep 12345
- If the process is another instance of the Spinnaker microservice that you’re trying to start, you can kill the process
kill -9 {pid}
# For instance, if the process id was 12345
kill -9 12345
- Try re-running
hal deploy apply