How to Connect AWS and Splunk to Ingest Log Data
By: Don Arnold | Splunk Consultant
Although a number of cloud platforms have emerged over the past 10 years, Amazon Web Services (AWS) appears to be taking the lead in cloud infrastructure. Companies that use AWS have either migrated their entire infrastructure or run on-premises systems alongside some AWS services in a hybrid solution. In either case, the AWS environment falls within the security boundary, should be part of the System Security Plan (SSP), and must be covered by Continuous Monitoring, which most security frameworks require. Splunk can meet the Continuous Monitoring requirement, including for instances and services within AWS.
Data push
There are two separate ways to get data from AWS into Splunk. The first is to “push” data from AWS to Splunk using Kinesis Firehose. This requires IP connectivity between AWS and a Splunk Heavy Forwarder, an HTTP Event Collector token, and the “Splunk Add-on for Amazon Kinesis Firehose” from Splunkbase.
Splunk Heavy Forwarder Setup
- Ensure the organization firewall has a rule to allow connectivity from AWS to the Splunk Heavy Forwarder over HTTPS.
- Go to Splunkbase.com and download/install the “Splunk Add-on for Amazon Kinesis Firehose” – Restart the Splunk Heavy Forwarder
- Create an HTTP Event Collector token:
- Go to Settings > Data Inputs > HTTP Event Collector
- Select New Token
- Enter a name for your token. Example: “AWS”. Select Next
- For Source type, click Select > Structured and choose “aws:firehose:json”. For App Context choose “Add-on for Kinesis Firehose”. Select Review
- Verify the settings and select
- Go back to Settings > Data Inputs > HTTP Event Collector and select Global Settings
- For “All Tokens” select Enabled, ensure “Enable SSL” is selected, and the “HTTP port number” is set to 8088. Select Save.
- Copy the “Token Value” for setup in AWS Kinesis Firehose.
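Before moving on to AWS, it can help to confirm that the new token is accepted. The following is a minimal Python sketch, not part of the add-on, that builds a test request for the standard HEC event endpoint (`/services/collector/event`). The function name `build_hec_request`, the host `hf.example.com`, and the token shown in the usage note are hypothetical placeholders. The optional `channel` argument is an assumption for tokens that have indexer acknowledgement enabled.

```python
import json
import urllib.request
from typing import Optional


def build_hec_request(host: str, token: str, event: dict,
                      port: int = 8088,
                      channel: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) an HTTPS POST to the HEC event endpoint.

    host and token are placeholders for your Heavy Forwarder address and
    the "Token Value" copied in the last step above.
    """
    url = f"https://{host}:{port}/services/collector/event"
    # HEC expects the payload wrapped in an "event" key.
    body = json.dumps({"event": event}).encode("utf-8")
    headers = {
        # HEC uses the "Splunk <token>" authorization scheme.
        "Authorization": f"Splunk {token}",
        "Content-Type": "application/json",
    }
    if channel is not None:
        # Assumption: only needed when indexer acknowledgement is enabled.
        headers["X-Splunk-Request-Channel"] = channel
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To send the request, pass it to `urllib.request.urlopen(req, timeout=10)`, for example with `req = build_hec_request("hf.example.com", "<token value>", {"message": "HEC test"})`. If the Heavy Forwarder uses a self-signed certificate, the call will fail certificate verification until a trusted certificate is installed.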
AWS Kinesis Firehose Setup
- Log in to AWS and go to the Kinesis service and select the “Get Started” button.
- On the top right you will see “Deliver Streaming data with Kinesis Firehose Delivery Streams.” Select the “Create Delivery Stream” button.
- Give your delivery stream a name. Under Source, choose “Direct PUT or other sources”. Select the “Next” button.
- Select “Disabled” for both Data transformation and Record format conversion.
- For Destination select “Splunk”. For Splunk cluster endpoint, enter the URL with port 8088 of your Splunk Heavy Forwarder. For Splunk endpoint type select “Raw endpoint”. For Authentication token, enter the Splunk HTTP Event Collector token value created in the Splunk Heavy Forwarder setup.
- For S3 backup select an S3 bucket. If one does not exist you can create one by selecting “Create New”. Select Next.
- Scroll down to Permissions and click the “Create new or choose” button. Choose an existing IAM role or create one. Click Allow to return to the previous menu. Select Next.
- Review the settings and select Create Delivery Stream.
- You will see a message stating “Successfully created delivery stream…”.
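The console steps above can also be expressed as API parameters. The sketch below is an illustration rather than the article's method: it builds the keyword arguments that the boto3 Firehose client's `create_delivery_stream` call accepts for a Splunk destination. The function name and all argument values are hypothetical, and the `FailedEventsOnly` backup mode is an assumption, since the steps above only say to select a backup bucket.

```python
def splunk_delivery_stream_args(name: str, hec_url: str, hec_token: str,
                                role_arn: str, backup_bucket_arn: str) -> dict:
    """Return keyword arguments for Firehose create_delivery_stream.

    Mirrors the console choices above: Direct PUT source, Splunk
    destination, Raw endpoint type, and an S3 backup bucket.
    """
    return {
        "DeliveryStreamName": name,
        "DeliveryStreamType": "DirectPut",
        "SplunkDestinationConfiguration": {
            "HECEndpoint": hec_url,      # e.g. https://<heavy forwarder>:8088
            "HECEndpointType": "Raw",    # matches "Raw endpoint" in the console
            "HECToken": hec_token,       # token value from the Splunk setup
            # Assumption: back up only events that fail delivery to Splunk.
            "S3BackupMode": "FailedEventsOnly",
            "S3Configuration": {
                "RoleARN": role_arn,
                "BucketARN": backup_bucket_arn,
            },
        },
    }
```

With boto3 installed (it is a third-party package) and AWS credentials configured, the arguments could be passed as `boto3.client("firehose").create_delivery_stream(**args)`.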
Test the Connection
- It is recommended that test data be used to verify the new connection by choosing the delivery stream and selecting “Test with Demo Data”. Go to step 2 and select “Start sending demo data”. You will see the delivery stream sending demo data to Splunk.
- Log into Splunk and enter index=main sourcetype=aws:firehose:json to verify events are streaming into Splunk.
- If no events show up, go back and verify all steps have been configured properly and firewall rules are set to allow AWS HTTPS events through to the Splunk Heavy Forwarder.
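The verification search can also be run without the web interface. The following sketch builds a request for Splunk's REST search export endpoint (`/services/search/jobs/export` on the management port, 8089 by default). It assumes a Splunk authentication token is available for bearer authentication. The function name, host, and token are hypothetical placeholders.

```python
import urllib.parse
import urllib.request


def build_export_search(host: str, auth_token: str, spl: str,
                        earliest: str = "-15m",
                        mgmt_port: int = 8089) -> urllib.request.Request:
    """Build (but do not send) a POST that streams search results as JSON."""
    url = f"https://{host}:{mgmt_port}/services/search/jobs/export"
    form = urllib.parse.urlencode({
        # The REST API expects the leading "search" command to be explicit.
        "search": f"search {spl}",
        "earliest_time": earliest,
        "output_mode": "json",
    }).encode("ascii")
    headers = {"Authorization": f"Bearer {auth_token}"}
    return urllib.request.Request(url, data=form, headers=headers, method="POST")
```

For example, `build_export_search("splunk.example.com", "<auth token>", "index=main sourcetype=aws:firehose:json")` produces a request that can be sent with `urllib.request.urlopen`. An empty response for a recent time window suggests the demo data is not arriving.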
Send Production Data
- Go to AWS Kinesis and select the delivery stream you set up. The status for the delivery stream should display “Active”.
- Go to Splunk and verify events are ingesting with index=main sourcetype=aws:firehose:json, and verify that the timestamp on the events is correct.
Data pull
The second way to get data into Splunk from AWS is to have Splunk “pull” data via a REST API call.
AWS Prerequisites Setup
- There are AWS service prerequisites that require setup prior to performing REST API calls from the Splunk Heavy Forwarder. The prerequisites can be found in this document: https://docs.splunk.com/Documentation/AddOns/released/AWS/ConfigureAWS
- Ensure all prerequisites are configured in AWS prior to configuring the “Splunk Add-on for AWS” on the Splunk Heavy Forwarder.
Splunk Heavy Forwarder Setup
- Ensure the organization firewall has a rule to allow connectivity from the Splunk Heavy Forwarder to AWS.
- Go to Splunkbase.com and install the “Splunk Add-on for AWS” – Restart the Splunk Heavy Forwarder.
- Launch the “Splunk Add-on for AWS” on the Splunk Heavy Forwarder.
- Go to the Configurations tab.
- Account tab: Select Add. Give the connection a name, enter the Key ID and Secret Key from the AWS IAM user account and select Add.
(To get the Key ID and Secret Key, go to AWS IAM > Access management > Users > (select user) > Security credentials > Create access key > Access Key ID and Secret Access key)
- IAM Role tab: Select Add. Give the Role a name, enter the Role ARN and select Add.
(To get the Role ARN, go to AWS IAM > Access management > Roles > (select role). At the top you will see the Role ARN)
- Go to the Inputs tab. Select Create New Input and select the type of data input from AWS to ingest. Each selection is different and all will use the User and Role created in the previous step. Go through the setup and select the AWS region, source type, and index and select Save.
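A mistyped Role ARN is an easy mistake to make when copying values into the IAM Role tab. As a small illustration, the sketch below checks that a string has the general shape of an IAM role ARN (`arn:<partition>:iam::<12-digit account ID>:role/<name>`) before it is pasted into the add-on. The helper name `parse_role_arn` and the example ARN are hypothetical. The check is syntactic only and does not confirm that the role exists.

```python
import re
from typing import NamedTuple


class RoleArn(NamedTuple):
    partition: str
    account_id: str
    role_name: str


# Partition, 12-digit account ID, then an optional path and the role name.
_ROLE_ARN = re.compile(r"^arn:(aws|aws-cn|aws-us-gov):iam::(\d{12}):role/(.+)$")


def parse_role_arn(arn: str) -> RoleArn:
    """Split an IAM role ARN into its parts, or raise ValueError."""
    match = _ROLE_ARN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    partition, account_id, path_and_name = match.groups()
    # A role may live under a path; the role name is the last segment.
    return RoleArn(partition, account_id, path_and_name.rsplit("/", 1)[-1])
```

For example, `parse_role_arn("arn:aws:iam::123456789012:role/SplunkIngest")` returns the partition `aws`, the account ID `123456789012`, and the role name `SplunkIngest`, while an S3 bucket ARN raises `ValueError`.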
Test the Connection
- Log into Splunk and enter index=main sourcetype=aws* to verify events are streaming into Splunk. Verify the sourcetype matches the one you selected in the input.
- If no events show up, go back and verify all steps have been configured properly and firewall rules are set to allow the Splunk Heavy Forwarder to reach AWS over HTTPS.
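When several inputs are configured, it can be useful to see which source types are actually arriving. The sketch below assumes the results of the index=main sourcetype=aws* search were exported as newline-delimited JSON, with each line holding a `result` object (the shape produced by Splunk's search export endpoint with `output_mode=json`). The function name and the sample lines in the usage note are made up for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def count_sourcetypes(export_lines: Iterable[str]) -> Counter:
    """Count events per sourcetype in newline-delimited JSON search output."""
    counts: Counter = Counter()
    for line in export_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        result = record.get("result")
        # Some lines carry only metadata and have no "result" object.
        if result and "sourcetype" in result:
            counts[result["sourcetype"]] += 1
    return counts
```

Given two made-up lines such as `{"result": {"sourcetype": "aws:cloudtrail"}}` and `{"result": {"sourcetype": "aws:s3"}}`, the function returns a count of one for each source type. A source type that was selected in an input but is missing from the counts points to that input as the place to start troubleshooting.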
With the popularity of AWS, more environments are starting to host hybrid solutions for a myriad of reasons. With that, using Splunk to maintain Continuous Monitoring is easily achieved with 2 different approaches for monitoring the expanded security boundary into the cloud. TekStream Solutions has Splunk AWS engineers on staff with years of experience and can assist you in connecting your AWS environment to Splunk.
References
https://docs.splunk.com/Documentation/AddOns/released/Firehose/About
https://docs.splunk.com/Documentation/AddOns/released/Firehose/ConfigureFirehose
https://docs.splunk.com/Documentation/AddOns/released/AWS/Description
https://docs.splunk.com/Documentation/AddOns/released/AWS/ConfigureAWS