LCA2018 Sysadmin Miniconf Presentations

The abstracts for the presentations accepted for the Linux.Conf.Au 2018 Sysadmin Miniconf are listed below. A draft presentation schedule is online now.

Presentation Titles

(Links sorted by Presenter's name; Abstracts below are sorted by first name of presenter.)

Becoming the Admiral: mastering Docker orchestration - Alistair Chapman
Fixing tridge's mistakes: Taking Samba AD to scale - Andrew Bartlett
Icinga 2 in a 24/7 Television Broadcast Environment - Dave Kempe
Migrating to the cloud - Devdas Bhagat
Day 2 Operations with Containers: Myth vs. Reality - Elizabeth K. Joseph
User Session Recording for the Enterprise - Fraser Tweedale
Next Generation Config Mgmt: Sysadmins - James Shubin
Puppet in the cloud - Jethro Carr
Monitoring All the Things! on your Linux system with the Elastic Stack - Josh Rich
Revisiting Sysadmin and Wordpress - Scripting/automating - Liz Quilty
MQTT as a Unified Message Bus for Infrastructure Services - Matthew Treinish
The New Old Thing: Dynamic Service Discovery with DNS - Matt Palmer
Designing scalable production Kubernetes clusters on AWS - Nick Young
Funny FOSS war stories from the pages of The Register - Simon Sharwood
Quick Introduction to OpenNMS - Tarus Balog
Cost-Effective Virtual Petabyte Storage Pools - Thomas Schoebel-Theuer
Principles Of Good Monitoring - Troy Lea

Full Abstracts

Becoming the Admiral: mastering Docker orchestration - Alistair Chapman

The transition to containers may be easy enough for developers, but managing and running a fleet of hundreds or thousands of containers can be a daunting change. The “production cliff” of taking Docker from a small proof-of-concept to fully resilient production service is extremely high and often underestimated.

In this talk, I will be discussing the major pitfalls to be aware of when transitioning towards containers in production, simple best practices for running production workloads and how to orchestrate new container infrastructure along with existing virtualised or bare-metal infrastructure.

This won’t just be another “containers are the greatest” talk, as I will cover the pitfalls, failings and downsides of running Docker-based container workloads of all types. In addition, we’ll be looking at how to test, migrate, run and monitor production services using orchestrators like Swarm, Kubernetes or OpenShift. Finally, I’ll show you how you can make use of public and hybrid cloud services to make running containerised services easier and get returns sooner.

About Alistair Chapman:

Alistair Chapman is an Australian InfoSec engineer, .NET developer and technical architect. While he’s currently working at Red Hat, he’s also spent years doing everything from network engineering to DevOps consulting, governance research to embedded development. Currently, his passions are security architecture, cross-platform .NET and containerisation, and he is a Microsoft .NET MVP. When not at work, Alistair is active in the .NET open-source community, including maintaining Cake (a .NET Foundation project).

Fixing tridge's mistakes: Taking Samba AD to scale - Andrew Bartlett

Samba 4.7 brings a massive improvement in stability and scale, and even larger organisations are deploying it regularly.

This talk will update system administrators on the current status in Samba, the locking issues we addressed with Samba 4.7, the new scale of organisation that Samba can now address and the tools we have written to test Samba at scale.

About Andrew Bartlett:

Andrew Bartlett is a Samba Developer working for Catalyst in Wellington, New Zealand.

Andrew has a long history in Samba's Active Directory Domain Controller, being part of the original Samba4 effort since its inception in 2003.

He has been a member of the Samba Team since 2001.

Icinga 2 in a 24/7 Television Broadcast Environment - Dave Kempe

Icinga2 is gaining steam and is much improved compared to its ancestors Nagios and Icinga1. I will present some war stories and implementation details from our Icinga deployments into television broadcast environments as a template for anyone using Icinga in critical environments with obscure hardware and software. From plugins we needed to develop to challenges in effecting change in staff practices, I will walk through the projects and share my experiences on the way. We will cover the implementation of distributed monitoring in Icinga2 with strict firewalls, building dashboards using Nagvis and integration of Opsgenie for alerting.

About Dave Kempe:

Dave founded Open Source IT services company Sol1 in 1999. Day to day, along with running a thriving IT services company, he builds Open Source infrastructure solutions for customers both large and small. With plenty of war stories to share and experience to give, he is keen to offer some advice and knowledge to new users and veterans alike.

Migrating to the cloud - Devdas Bhagat

This is an experience report of a migration from self-hosted services to running in the cloud. While there have been plenty of business case studies showing the benefits of a cloud migration, there are very few reports on the IT side of the migration.

This talk covers the migration of Spilgames (a small Dutch games publisher) from a self-hosted OpenStack and hardware-based infrastructure to Google Cloud, including the challenges and the tooling (and lack thereof). This migration is still a work in progress, and the talk will cover as much detail as possible.

About Devdas Bhagat:

Devdas Bhagat is a long time Linux user, a system administrator, a developer, a DBA and occasional network engineer. He has been a frequent speaker at various conferences in the Asia Pacific region and in the EU on a wide variety of topics.

Day 2 Operations with Containers: Myth vs. Reality - Elizabeth K. Joseph

Containers have been hailed as an easy solution to many problems, from software testing to scaling stateless workloads in production. But anyone can write a deployment tool for a container-based infrastructure; the hard work comes when you get to day 2 and you need to handle the day-to-day operations and maintenance. Metrics, monitoring, logs, debugging, backups and upgrades are all considerations that systems administrators need to take into account before they invest in a solution.

Through lessons learned from direct experience in operations, as well as feedback from open source DC/OS community members, this talk will pull back the curtain to show the internals of how to handle these day 2 operations. It will also provide a checklist of things you want to make sure are included when you build a plan for building and maintaining your infrastructure (hint: Logging should never be an afterthought).

About Elizabeth K. Joseph:

Elizabeth K. Joseph is a Developer Advocate at Mesosphere focused on DC/OS and Apache Mesos. Previously, she spent four years as a systems engineer on the OpenStack Infrastructure team and six years on the Ubuntu Community Council. She is the author of Common OpenStack Deployments (2016) and The Official Ubuntu Book, 8th (2014) and 9th (2016) editions. At home in San Francisco, she sits on the Board of Directors for Partimus.org, a non-profit providing Linux-based computers to schools and community centers in need.

User Session Recording for the Enterprise - Fraser Tweedale

For Open Source software to conquer the enterprise, we need to play along with government and industry regulations, and help organisations meet their security and audit requirements. Sometimes this means tracking everything a user sees and does. A flexible and scalable Open Source user session recording solution is needed.

In this presentation we will discuss the limitations of existing Open Source approaches, then present the Scribery project, an end-to-end session recording solution with features including:

terminal session playback and real-time monitoring (including what the user sees)
centralised storage and correlation with auditd log events
centralised control of what or whom to record, via SSSD and in the future FreeIPA
Cockpit integration

The presentation will include a demo of a user session being recorded, stored centrally, inspected and played back.

We will look at the architecture, discuss implementation challenges, and conclude with an overview of the road ahead.

The intended audience is system administrators and security officers responsible for security and compliance, and developers of security, identity and policy management systems.

scribery.github.io

About Fraser Tweedale:

Fraser works at Red Hat on the FreeIPA identity management system and Dogtag Certificate System. He's interested in security, cryptography, functional programming, type theory and theorem proving. Jalapeño aficionado.

Next Generation Config Mgmt: Sysadmins - James Shubin

Next Generation Config Mgmt: Reactive Systems

Mgmt is a next gen config management tool that takes a fresh look at automation.

The main design features of the tool include:

Parallel execution
Event-driven mechanism
Distributed architecture
Declarative, functional, reactive programming language

The tool has two main parts: the engine, and the language. This presentation will demo both and include many interactive examples showing you how to build reactive, autonomous, real-time systems. Finally we'll talk about some of the future designs we're planning and make it easy for new users to get involved and help shape the project.

A number of blog posts on the subject are available at https://ttboj.wordpress.com/?s=mgmtconfig. Attendees are encouraged to read some before the talk if they want a preview!

About James Shubin:

James is a DevOps/Config mgmt. hacker and physiologist from Montreal, Canada. He often goes by @purpleidea on the internet, and writes "The Technical Blog of James". He works for Red Hat researching and prototyping around automation engineering. He started a Next Generation Config Management project called mgmt. He studied Physiology at university and sometimes likes to talk about cardiology.

Puppet in the cloud - Jethro Carr

Whilst containerisation tech like Kubernetes is increasingly popular, many organisations still need to run large fleets of servers in cloud providers using more traditional configuration management systems.

This talk explains how to implement open source Puppet in a reliable and secure manner in the AWS cloud, including a CI/CD workflow for releasing new configuration, secure autosigning of systems and reliable, failure-tolerant Puppet masters.

About Jethro Carr:

When not herding cats around his house, Jethro spends his time operating large fleets of GNU/Linux servers and converting coffee to Puppet manifests and CloudFormation stacks.

Monitoring All the Things! on your Linux system with the Elastic Stack - Josh Rich

In this talk, we'll look at how you can easily ingest your Linux system logs and various OS metrics into Elasticsearch using Filebeat and Metricbeat modules. Modules are a new concept in the open-source Filebeat and Metricbeat tools made by Elastic. We can then visually examine both our system's performance and all events occurring on it over time with Kibana. This is a near-complete open source monitoring solution for a Linux system.

Assuming the demo gods allow, we'll have a little bit of a play with our systems by inducing CPU/memory load or spamming log lines to see how it reacts in Kibana, and correlate the different sources of information together in a single Kibana dashboard, providing a relatively complete view of what is happening on the system.

Finally, anything missing that we want to monitor or record can be covered by writing our own Filebeat or Metricbeat module. So we will take a dive into the code to see how you can contribute your own Filebeat or Metricbeat module to these projects.

About Josh Rich:

Josh is a technical support engineer with Elastic, which means he helps people do awesome things with the Elastic stack every day. He joined Elastic from a background in scientific research and high-performance computing. Ex Gentoo-er, now Fedorian. Ops more than dev but likes to dabble in all the things.

Revisiting Sysadmin and Wordpress - Scripting/automating - Liz Quilty

Wordpress is often installed by less experienced users, following a guide they found online which may well be out of date and insecure. By the time Sysadmins are involved, their job is often cleaning up the result of those inexperienced choices. However, there's a useful proactive role for Sysadmins too, to avoid it being "all cleanup".

This talk is going to show a few shell scripts I created to make updating and fixing exploits faster and easier, as well as scripts for reinstalling exploited installs, checking they are up to date, and a few other things.

About Liz Quilty:

Linux System Administrator for 20 years, currently working for Rimuhosting. I enjoy trying to find ways that people less technical can help themselves, automating as much as possible.

MQTT as a Unified Message Bus for Infrastructure Services - Matthew Treinish

Development and testing of the OpenStack project operates at a tremendous scale, with hundreds of code repositories and thousands of contributors interacting continuously. The infrastructure to support this has to operate at an equally large scale to ensure that it is not outpaced by the volume of upstream development activity. Enabling users and other consumers to see what is happening in real time in this increasingly complex infrastructure becomes equally complex and large. This is why we need interfaces available to develop tooling to handle this.

MQTT, best known for its use in IoT and sensor network applications, also provides a number of advantages when used as an event bus for infrastructure services. For the OpenStack developer and community infrastructure we introduced firehose.openstack.org to provide a unified message bus. Built using MQTT and Mosquitto, firehose is an interface where services can publish events.

This talk will cover how you can use MQTT as a unified event bus for infrastructure services. It will explain some basics of the MQTT protocol and why it's well suited for this application. It will also use the OpenStack community infrastructure's firehose as a case study to explain the benefits and how a similar system can be used for your own needs, experimentation and innovation.
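One of the MQTT basics that makes it workable as a shared event bus is the hierarchical topic with wildcard subscriptions: `+` matches exactly one topic level and `#` matches all remaining levels. The following is a minimal sketch of that matching rule in Python, not code from the talk or from firehose; the topic names are hypothetical, and the special handling of topics beginning with `$` is deliberately left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a subscription filter.

    '+' matches exactly one level. '#' matches the parent level and
    everything below it, and is only valid as the last filter level.
    Simplification (assumption): '$'-prefixed topics are not special-cased.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard must be the final level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # Filter has more levels than the topic.
            return False
        if level != "+" and level != topic_levels[i]:
            return False

    # No '#' seen: both must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


if __name__ == "__main__":
    # Hypothetical topic names, for illustration only.
    print(topic_matches("ci/+/status", "ci/build42/status"))
    print(topic_matches("ci/#", "ci/build42/log/line"))
    print(topic_matches("ci/+/status", "ci/build42/log/status"))
```

A consumer that only cares about one service subscribes with a narrow filter, while a dashboard can subscribe with `#` under a prefix and see everything published beneath it.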

About Matthew Treinish:

Matthew has been working on and contributing to Open Source software for most of his career. He has been primarily contributing to OpenStack since 2012 and is a former member of the OpenStack TC (Technical Committee). He was previously the PTL (project technical lead) of the OpenStack community's QA program from OpenStack's Juno development cycle in 2014 through the Mitaka development cycle in 2016. He is a core contributor for several OpenStack projects and a member of the OpenStack Stable Maintenance Team. Matthew currently works for IBM's Developer Advocacy team working to make Open Source software better for everyone. He has previously been a speaker at OpenStack summits, LinuxCons Japan, China, and North America, OpenWest, FOSSASIA, and PyConAU's OpenStack miniconf.

The New Old Thing: Dynamic Service Discovery with DNS - Matt Palmer

With containers, orchestration, and distributed microservices being all the rage, service discovery (the art of finding what to talk to) is more important than ever. Many technologies and systems have been developed, and are under active development, to solve this problem. But let's take a step back, and take a look at the oldest service discovery system of them all: DNS.

There is a standard for service discovery using DNS: DNS-SD, RFC6763. Why isn't anyone using it? Or perhaps they are, and we don't know it because it Just Works? We'll dissect the guts of this protocol, examine its strengths and weaknesses, compare it to some other popular service discovery systems, and based on practical operational experience, decide if everyone should be adopting this quiet achiever.
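As a rough illustration of the lookup sequence DNS-SD defines, the sketch below walks the three record types involved: a PTR query on the service name lists instances, then SRV gives the host and port and TXT gives key=value metadata for each instance. To stay self-contained it reads from an in-memory table instead of querying a resolver; the zone data, names and the `discover` helper are all hypothetical.

```python
# Hypothetical zone data standing in for real DNS answers.
RECORDS = {
    ("_http._tcp.example.com", "PTR"): ["web1._http._tcp.example.com"],
    # SRV fields: (priority, weight, port, target)
    ("web1._http._tcp.example.com", "SRV"): [(0, 0, 8080, "host1.example.com")],
    # TXT record carrying key=value strings
    ("web1._http._tcp.example.com", "TXT"): [["path=/status"]],
}


def discover(service, proto, domain, records=RECORDS):
    """Follow the DNS-SD lookup order: PTR, then SRV and TXT per instance."""
    service_name = f"_{service}._{proto}.{domain}"
    found = []
    for instance in records.get((service_name, "PTR"), []):
        # Parse TXT strings of the form key=value into a dict.
        attributes = {}
        for txt in records.get((instance, "TXT"), []):
            for item in txt:
                key, _, value = item.partition("=")
                attributes[key] = value
        for priority, weight, port, target in records.get((instance, "SRV"), []):
            found.append({
                "instance": instance,
                "host": target,
                "port": port,
                "priority": priority,
                "weight": weight,
                "txt": attributes,
            })
    return found


if __name__ == "__main__":
    for entry in discover("http", "tcp", "example.com"):
        print(entry["instance"], entry["host"], entry["port"], entry["txt"])
```

In a real deployment the table lookups would be replaced by DNS queries through a resolver library, which is outside the scope of this sketch.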

About Matt Palmer:

Matt is a beardy Unix guy who has been convincing computers to do things for a very long time. Part developer, part sysadmin, part manager, he has seen a lot of things, and once started on a topic, is very difficult to stop.

Designing scalable production Kubernetes clusters on AWS - Nick Young

When you’re building a Kubernetes cluster that can scale, there are some ways in which your choices can affect you in ways that are not immediately obvious.

This talk uses our experience in building a platform that fits as many use cases as we can find inside Atlassian to talk about how we found these limitations, what we did about them, and to build some rules of thumb for designing similar platforms.

In particular, I’ll be talking about the importance of building well-demarcated layers, and the ways in which your networking decisions can introduce scaling constraints.

You should come away from this talk with an understanding of why we made the decisions we did about layering and networking, and with some insights for your own Kubernetes deployments, either on AWS or elsewhere.

About Nick Young:

Nick has been working to prevent the entropic downfall of systems for 20 years, across Windows and Linux, datacenters and clouds, networking, storage and compute. Currently, he's a Principal Engineer in Atlassian's Kubernetes Infrastructure Technology Team, where in addition to his primary tasks of Knight Rider puns and Simpsons quotes, he builds Kubernetes platforms on AWS. In his spare time, he spends time with his young family, then with whatever's left he does his best to maintain his jack-of-all-geeks card, tinkering with his home setup, playing video games, watching TV, movies, and anime, and reading as much as he can.

Catch him on Twitter @youngnick

Funny FOSS war stories from the pages of The Register - Simon Sharwood

Heard the one about the Linux user who let a Windows tech support scammer log onto their PC, then laughed as the scammer wondered why nothing looked like Windows? Or the Linux user so upset by being forced to use MS Outlook that they accused a sysadmin of deleting their email?

Those stories, and more, have been told to The Register and APAC Editor Simon Sharwood would like to share them with you in ten minutes of mirth.

About Simon Sharwood:

Simon Sharwood is the Asia-Pacific Editor of The Register.

Quick Introduction to OpenNMS - Tarus Balog

OpenNMS is the most powerful network monitoring platform of which you've never heard. Started in 1999, the focus of OpenNMS is scalability, with its aim to be the de facto network monitoring platform for the Internet of Things.

This short talk will cover the high points of OpenNMS, from managing events, collecting data, monitoring services and provisioning the system at scale. It will also touch on upcoming features including flow analysis and bringing machine learning to network monitoring.

OpenNMS is 100% free and open source software (there is no "enterprise" version with extra features) and it is used by some of the largest organizations in the world. Come see if OpenNMS is a good fit for yours.

About Tarus Balog:

Tarus Balog has been involved in managing communications networks professionally since 1988, and unprofessionally since 1978 when he got his first computer - a TRS-80 from Radio Shack. Having worked as a network management consultant for many years, he was constantly frustrated by the lack of flexibility involved in commercial solutions from such companies as HP and IBM, as well as shocked by their high prices. Looking for a better solution, he turned to open source and joined the OpenNMS project in 2001 and became the principal administrator of the project in 2002. Since then he has managed not only to make a living working with free software, but the OpenNMS Group, the services company behind the project, has thrived. He is an outspoken evangelist for open source software and the communities it inspires.

Cost-Effective Virtual Petabyte Storage Pools - Thomas Schoebel-Theuer

Background migration of logical volumes (LVs) during operation via MARS is the key for low-cost virtual storage pools on thousands of servers, reducing server hardware and networking costs because no expensive O(n^2) storage network is needed anymore. It also increases reliability for many usage scenarios when compared to similarly sized big storage clusters. First experiences from 1&1 Internet SE, Shared Hosting Linux, are reported.

About Thomas Schoebel-Theuer:

Thomas Schoebel-Theuer has been working in operating systems since the 1980s, both in academia and in industry. He is an old-school Linux kernel hacker, inventor of the dentry cache for speeding up metadata access to filesystems. Currently he is working in industry on long-distance replication of petabytes of data, and on background migration in big sharding clusters / grids.

Principles Of Good Monitoring - Troy Lea

Deploying a monitoring system can be a daunting process. If you're starting out from scratch you'll come across options like active vs passive, performance metrics, log files, bandwidth and SNMP. You can easily get lost trying to make sense of it all. This talk will discuss why each method exists and what open source solutions are available for each method. Nagios Core and its open source modules will be referenced in the talk, however other open source agents used by Nagios will also be discussed. Log server monitoring will discuss the ELK stack. Netflow monitoring will discuss nfcap. The topics covered in this talk can be applied to any open source monitoring system.

About Troy Lea:

Troy is a self-described jack of all trades, a skill that is required when monitoring systems is your passion. He has been an Independent Contractor for Nagios Enterprises LLC since April 2014 and works remotely from Deniliquin NSW.

Figuring out how to monitor things and then writing the documentation is something he is passionate about. Being able to save someone else from re-inventing the wheel is very satisfying.

When he's not working there is always a project of sorts on the go, such as restoring a 1967 International Harvester bus so he can go north during winter to work remotely-remotely. He also enjoys water skiing / wakeboarding, teaching kids to water ski / wakeboard, walking his dog and Call of Duty.

linux.conf.au Systems Administration Miniconf Sydney, January 2018
